Oct 27, 2011

F# as an Octave/Matlab replacement for Machine Learning

Back when I was in college, I took three different courses that dealt with subjects related to machine learning and data mining. Although I didn’t lose interest in those subjects, my work has led me in a totally unrelated direction, so I haven’t exercised any of that knowledge in about eight years. A few weeks ago, I stumbled upon Stanford’s online class on Machine Learning and decided to enroll. I want to revive many of the things I have forgotten and try to put them into practice, as nowadays it’s very easy to access large amounts of interesting data from all kinds of online sources.

The programming exercises of this class are supposed to be done in Octave or Matlab. While I understand the advantages of these tools, my past experience (where all the exercises and projects were done either with SAS or with Matlab) tells me that not using a general-purpose programming language makes it harder to turn academic exercises into real-world programs. As Professor Andrew Ng said in the introduction, one of the goals of the class is for us to put machine learning into practice on real-world problems we care about, so I decided to implement all the algorithms and exercises in F#.

Oct 2, 2011

Introducing PreSharp

Background

Back in 2004, I was doing some code-generation work as part of the OBIWAN project. When I started, CodeDom was being used to do the work, but I really didn’t like it because it made the generator code very hard to read and modify. Realistically, I would not need to support any language other than C#, so I started looking for alternatives. CodeSmith was very popular at the time for generating type-safe collections (.NET 2.0 generics didn’t exist yet), but it was targeted at one-shot generation, not at writing code that generates code. Then I found a very simple tool named CodeGen that appealed to me. I had recently been playing around with the Boost Preprocessor library, so I really liked the idea of using the preprocessor. I made a few tweaks to it and was able to use it for my needs at the time. Later on, around mid-2007, I needed to do code generation again, so I took this tool and made it considerably more powerful. By then it had diverged so far from the original code that I re-baptized it as PreSharp and published it to CodePlex. I never got around to writing any documentation for it, so I’m making this post to try to compensate for that. I also recently moved the project from CodePlex to GitHub.

Sep 27, 2011

Microsoft: please eat some more of your WPF-flavored dog food

I have a love-hate relationship with WPF.

I think it is several orders of magnitude better than Windows Forms and gives developers tremendous expressive power to turn their ideas into good user interfaces. I wouldn’t dream of doing many of the things I was able to do in just a few weeks in the Agile Platform IDE in any other UI framework, even if I were given months. It’s like going from plain C to C# 4.

But even though the concept is really good, the quality and completeness of the concrete implementation are really not there. There are just too many quirks, weird limitations, parts that don’t play nice with others, performance problems, memory leaks, etc... And when things don’t work as they should, many times you really have to go out of your way to be able to fix things.

I understand deadlines and priorities, and I know that probably Microsoft just had to ship something at some point, but it really seems that there was a big lack of dogfooding in the WPF case.

There’s a striking example of this: what was the number one complaint that developers had about WPF since 2006?… Blurry text and images. And when did Microsoft fix it?… Only in 2010, when they started using WPF for Visual Studio.

Another issue that has been bugging me since I started using WPF is the airspace limitation. It seems that it’s finally going to be fixed in 4.5. Why do I think it’s being solved now? Because they probably needed some native WinRT component to play nice with WPF…

I think eating your own dog food is really important. At OutSystems we use our product to build all our web applications: the R&D apps (bug tracking, project management, continuous integration…), the HR apps (directory, vacations, recruiting…), the sales apps, the marketing apps, the corporate site, the community forums, etc… And I can say with a great deal of confidence that if we didn’t do that, our product wouldn’t be half of what it is today.

Bottom line: we as developers should always try to eat our own dog food as much as we can. In fact, I think it’s so important that it should have been item 13 in the Joel Test.

Sep 25, 2011

Migrating from SubText to BlogEngine.NET

My blog was running on SubText, but I’ve recently changed to BlogEngine.NET. The main reason for the change was that BlogEngine.NET supports storing the data in plain XML files, without any dependency on a database. This makes it really easy to customize, as I can develop everything on my local machine and then just send the files to the server. With SubText this was harder to do, so I ended up doing fewer tweaks, and doing them directly in production, which I didn’t like.

The migration wasn’t as smooth as I thought it would be, so here are the steps I took, in case someone needs to go through a similar process:

  • Exporting to BlogML and then importing didn’t work correctly. After exporting, I had to convert the content from Base64, replace the &amp;’s in the titles with &, and then manually set the author in all posts. I also had to set the tags manually in all posts, as in SubText they weren’t a separate field but were instead part of the content of the posts. Having the posts in XML files on disk rather than in a database made this job a lot easier.
  • The TinyMCE bundled in BlogEngine.NET doesn’t support images, so I had to replace it with CKEditor and CKFinder.
  • The SubText post URLs have the format /archive/year/month/day/name-of-the-post.aspx, while BlogEngine.NET post URLs have the format /server/post/year/month/day/name-of-the-post.aspx. I had to set up redirection so the old links would keep working. To do that, I took the source code from the BlogEngine SEO Redirection extension and changed the comparison to only check prefixes (source code here).
  • Several of the extensions available out there don’t work out of the box with the latest version of BlogEngine.NET (2.5):
    • For the AddThis extension I had to add some missing using directives to the code, and disable the form validation of the settings page (source code here).
    • To be able to configure widgets in the AllTuts theme, I had to make a small bug fix (as described in Janier Davila’s comment here).
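
As a rough illustration of the content clean-up and the prefix-based redirection described in the steps above, here is a minimal Python sketch. It is not the code actually used for the migration (the redirection was done with a modified BlogEngine.NET extension). The function names are hypothetical, UTF-8 is assumed for the decoded content, and the two prefixes are simply copied from the URL formats quoted above.

```python
import base64

# Prefixes copied from the URL formats quoted in the post; adjust for a real site.
OLD_PREFIX = "/archive/"
NEW_PREFIX = "/server/post/"


def decode_content(encoded: str) -> str:
    """Decode Base64-encoded post content (UTF-8 is an assumption)."""
    return base64.b64decode(encoded).decode("utf-8")


def clean_title(title: str) -> str:
    """Turn the literal '&amp;' left in exported titles back into '&'."""
    return title.replace("&amp;", "&")


def redirect_target(path: str):
    """Return the new URL for an old-style path, or None if no redirect applies.

    Only the prefix is compared (case-insensitively); the rest of the path,
    i.e. year/month/day/name-of-the-post.aspx, is carried over unchanged.
    """
    if path.lower().startswith(OLD_PREFIX):
        return NEW_PREFIX + path[len(OLD_PREFIX):]
    return None
```

Checking only the prefix means a single rule covers every old post, since the date and slug parts of the URL are identical in both formats.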

After all that work, I now have a setup that allows me to tweak the blog very easily. I have it on GitHub, cloned onto my local machine and onto the server. I can play around locally, and when I’m happy with the results I just need to do a git push locally, then remote desktop to the server to do a git pull and force a reload of the application by touching the web.config.

Of course, to protect my passwords and keys, I have to tell git to ignore users.xml, akismetfilter.xml and recaptcha.xml, using this procedure. And I also must take care with the email settings that are stored on the global settings.xml file, which forces me to always have that file uncommitted at the server, but other than that, it’s orders of magnitude better than editing directly on production.

Apr 19, 2011

Replacing Visual Studio 2010 with SharpDevelop 4.1

I use Visual Studio 2010 on a daily basis, and I must say that the overall experience is far from good:

  • The find dialog has some really weird and annoying resizing issues.
  • Sometimes find text simply stops working, never returning any result. Even worse, sometimes it starts returning only the first hit in each file, which of course you only notice the next day, when you find the bug it made you introduce into the code base.
  • It has the habit of recompiling more projects than it really needs to.
  • Sometimes the simple action of clicking on a text editor to change the caret position stops working, and you can only use the keyboard for that until you restart Visual Studio.
  • It takes forever to open XAML files (even with the option to always open in full XAML view, not design mode).
  • It has some serious file cache problems when using custom MSBuild build steps that generate files for compilation. If during a build a generated file is changed while it’s opened, the result of the compilation will be as if the file hadn’t been changed. If that file is not opened, or we use the MSBuild command line, it works correctly. Just imagine going through a huge list of steps to reproduce some problem only to find out that you’re debugging outdated code. It’s a real nightmare.

So I decided to give SharpDevelop a try, and I was pleasantly surprised:

  • It’s pretty fast and responsive, even with big solutions, as long as you turn off source control integration.
  • I can open .xaml files as fast as .cs files, and the XAML auto-complete works well enough.
  • It builds the big solution I use daily in half the time (literally).
  • It doesn’t have the file cache problem.
  • It has an option for building modified projects only, ignoring the dependencies.
  • The latest 4.1 alpha builds let you see the metadata of referenced types, like Visual Studio does.

Of course, there are also a few downsides:

  • The usability of the editor auto complete is noticeably lower than Visual Studio IntelliSense.
  • The shortcut keys aren’t customizable, which is a big no-no considering that basic ones like build and toggle breakpoint are different from the ones we’re used to from Visual Studio.
  • Custom debugger visualizers and other very useful add-ins for Visual Studio don’t have a SharpDevelop version.
  • It doesn’t have Edit and Continue.

I had tried SharpDevelop before, but at the time it seemed too limited compared to Visual Studio. I now think it’s good enough to be usable, and as it’s open source and the code base looks readable, I’m going to try to fix a few quirks so that it can completely replace Visual Studio for my daily coding needs.

The first thing I did was to create a SharpDevelop add-in with functionality equivalent to the DPack File Browser, which is one of the Visual Studio add-ins that I use the most. I got a working prototype really fast, as the SharpDevelop add-in API is much cleaner than the Visual Studio EnvDTE mess.

I posted the source code to https://github.com/ovatsus/CodeBeside.FileBrowser

About the author

  Gustavo Guerra
  London, UK

  Software Developer
  interested in Functional Programming, Artificial Intelligence, and User Experience Design