One of the things to keep in mind when learning a new programming language is that it isn’t enough to learn the syntax and semantics of the language features you are unfamiliar with. Just as important is learning the idioms and ways of thinking that go with these language features. On Reddit, ubernostrum pointed out that my usage of

if all_links.get(url) == None:

was jarring to read as a Python programmer when compared to the more idiomatic

if url not in all_links:

Of course this is just a stylistic issue, but his point is valid. A similar thing happened with regard to other aspects of my recent post entitled Does C# 3.0 Beat Dynamic Languages at their Own Game?
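
The two checks are also not strictly interchangeable, which is one more argument for the membership test. A minimal sketch, assuming a made-up all_links dictionary (the URLs and the two helper functions are hypothetical and exist only for this illustration):

```python
# Hypothetical data for illustration only; not taken from the original script.
all_links = {
    "http://example.com/a": [],    # key present, empty vote list
    "http://example.com/b": None,  # key present, but the value is None
}

def missing_by_get(url):
    # The check from the original post: also true when the key exists
    # but happens to map to None.
    return all_links.get(url) == None  # kept as written to mirror the post

def missing_by_membership(url):
    # The idiomatic check: true only when the key is absent.
    return url not in all_links
```

The two functions agree for keys that are absent and for keys with ordinary values, and disagree only when a key is present with the value None.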

I argued that type inference and anonymous types in C# 3.0 did not offer the same degree of functionality that tuples and dynamic typing did when it came to processing intermediate values in a computation without requiring nominal types (i.e. named classes) to hold these values.

Specifically I had the following IronPython code,

IronPython Code

    for item in filteredItems:
        vote = (voteFunc(item), item, feedTitle)

        # add a vote for each of the URLs
        for url in item.outgoing_links.Keys:
            if all_links.get(url) is None:
                all_links[url] = []
            all_links.get(url).append(vote)

    # tally the votes, only 1 vote counts per feed
    weighted_links = []
    for link, votes in all_links.items():
        site = {}
        for weight, item, feedTitle in votes:
            site[feedTitle] = min(site.get(feedTitle,1), weight)
        weighted_links.append((sum(site.values()), link))
    weighted_links.sort()
    weighted_links.reverse()

The key things to note about the above code block are (i) the variable named vote is a tuple of three values: the numeric weight given to a link received from a particular RSS item, the RSS item itself and the title of the feed, and (ii) the items in each tuple can be unpacked into individual variables when looping over the list of votes in a for loop.
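
To see the unpacking in isolation, here is a small self-contained sketch of the tallying step. The URLs, item names, feed titles and weights are invented for the example; in the real script they come from the RSS items and voteFunc.

```python
# Hypothetical votes, each a (weight, item, feed_title) tuple.
all_links = {
    "http://example.com/a": [
        (1.0, "item1", "Feed A"),
        (0.5, "item2", "Feed A"),
        (0.75, "item3", "Feed B"),
    ],
    "http://example.com/b": [
        (0.25, "item4", "Feed B"),
    ],
}

weighted_links = []
for link, votes in all_links.items():
    site = {}
    # Each vote tuple is unpacked into three names as the loop runs.
    for weight, item, feed_title in votes:
        # Only one vote counts per feed: keep the smallest weight seen so far,
        # starting from a default of 1.
        site[feed_title] = min(site.get(feed_title, 1), weight)
    weighted_links.append((sum(site.values()), link))

# Highest score first.
weighted_links.sort(reverse=True)
```

With this sample data the first link scores 0.5 + 0.75 = 1.25 (one vote from each feed) and the second scores 0.25, so the first link sorts to the front.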

When I tried to write the same code in C# 3.0 with a vote variable that was an anonymous type, I hit a roadblock. When I placed instances of the anonymous type in the list, I had no way of knowing what the data type of the object I’d be pulling out of the list would be when I wanted to extract it later to tally the votes. Since C# is statically typed, knowing the type’s name is a requirement for retrieving the objects from the list later unless I planned to interact with them as instances of System.Object and access their fields through reflection (or something just as weird).

So in my C# 3.0 solution I ended up creating RankedLink and Vote types to simulate the functionality I was getting from tuples in Python.

However, it turns out I was using anonymous types incorrectly. I tried to take a feature that was meant to be coupled with C# 3.0’s declarative Language Integrated Query (LINQ) and use it in the traditional imperative loop constructs I’ve been familiar with since my days programming in C.

Ian Griffiths set me straight with his blog post entitled Dare Obasanjo on C# Anonymous Types, where he showed how to use anonymous types to get the solution I wanted without having to create unnecessary named types to hold intermediate values. Ian’s code is shown below.

C# 3.0 Code

// calculate vote for each outgoing url
var all_links = from item in items
                from url in item.OutgoingLinks.Keys 
                group item by url into itemUrlGroup
                select new
                {
                  Url=itemUrlGroup.Key,
                  Votes=from item in itemUrlGroup
                        select new
                        {
                          Weight=voteFunc(item),
                          Item=item,
                          FeedTitle=feedTitle
                        }
                };

// tally the votes
var weighted_links = from link_n_votes in all_links
                     select new
                     {
                       Url=link_n_votes.Url,
                       Score=(from vote in link_n_votes.Votes
                              group vote by vote.FeedTitle into feed
                              select feed.Min(vote => vote.Weight)
                             ).Sum()
                     } into weighted_link
                     orderby weighted_link.Score descending
                     select weighted_link;

As you can see, Ian’s code performs the same task as the Python code does but with a completely different approach. The anonymous types are performing the same function as the Python tuples did in my previous code sample and there is no need to create RankedLink and Vote types to hold these intermediate values.
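
For readers more at home in Python, the shape of the query-style solution can be approximated with sorted and itertools.groupby. This is only a rough analogue under assumed data, not a translation of Ian’s code: the flattened (url, feed_title, weight) rows and the score helper are hypothetical and exist only for this sketch.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical flattened votes: (url, feed_title, weight).
votes = [
    ("http://example.com/a", "Feed A", 1.0),
    ("http://example.com/a", "Feed A", 0.5),
    ("http://example.com/a", "Feed B", 0.75),
    ("http://example.com/b", "Feed B", 0.25),
]

def score(url_votes):
    # One vote per feed: group by feed title, keep the minimum weight
    # in each group, then add the per-feed minimums together.
    by_feed = sorted(url_votes, key=itemgetter(1))
    return sum(
        min(weight for _, _, weight in group)
        for _, group in groupby(by_feed, key=itemgetter(1))
    )

# groupby only merges adjacent rows, so sort by URL before grouping.
by_url = sorted(votes, key=itemgetter(0))
weighted_links = sorted(
    ((score(list(group)), url) for url, group in groupby(by_url, key=itemgetter(0))),
    reverse=True,
)
```

As in the LINQ version, there are no explicit loops that mutate a dictionary; the grouping, the per-feed minimum and the final ordering are each stated as an expression over the data.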

What I find interesting about this is that even though I’ve been using C# for the past five or six years, I feel like I have to relearn the language from scratch to fully understand or be able to take advantage of the LINQ features. Perhaps a few stints as a SQL developer may be necessary as well?

 


 

Saturday, 05 January 2008 17:57:33 (GMT Standard Time, UTC+00:00)
At first one could probably ask why Ian's code is better than your original one. One thing comes to mind: PLINQ. If you extend the thing a bit to also put the reading of files into the LINQ query, and then combine this with the PLINQ CTP you get something so powerful it is hard to believe. Unless I am mistaken, all you have to do then is add something like one ".AsParallel" (it is called differently, just can't remember it right now) to the query, and the whole thing is going to use all available cores of your CPU automatically. If you think how difficult and error prone it would have been to parallelize your original code, I can see a real killer argument for LINQ there.
davidacoder
Saturday, 05 January 2008 18:32:01 (GMT Standard Time, UTC+00:00)
"[...] However it turns out I was using anonymous types incorrectly. I tried to take a feature that was meant to be coupled with C# 3.0’s declarative Language Integrated Query (LINQ) and use it in the traditional imperative loop constructs I’ve been familiar with since my days programming in C. [...] What I find interesting about this is that even though I’ve been using C# for the past five or six years, I feel like I have to relearn the language from scratch to fully understand or be able to take advantage the LINQ features. Perhaps a few stints as a SQL developer may be necessary as well? [...]"


Interesting how things become more and more complex and integrated with each other (with the excuse of becoming simpler, btw - see the schizophrenia in the market here?) and you always need to know more and more things, instead...

Interestingly enough this happens in all fields, and it is not just hitting on developers but also on "system" guys...

Saturday, 05 January 2008 20:51:15 (GMT Standard Time, UTC+00:00)
off topic, but you can simplify your python even more...

if all_links.get(url) is None:
    all_links[url] = []
all_links.get(url).append(vote)

becomes :

all_links.setdefault(url, []).append(vote)
Pete Cable
Sunday, 06 January 2008 08:00:56 (GMT Standard Time, UTC+00:00)
Just looking at the code from a distance, I think C#/LINQ beats Python at its own game at being more beautiful.

I agree LINQ is something you'll have to learn, but it is VERY sexy!
juulepuul
Sunday, 06 January 2008 22:27:26 (GMT Standard Time, UTC+00:00)
LINQ makes C# look more like SQL and less like Python. I can't read that as easily as a series of simple method calls. I also need to figure out how debugging works on LINQ.
Ranji
Monday, 07 January 2008 16:56:15 (GMT Standard Time, UTC+00:00)
Minor note, this is C#/.NET 3.5, not 3.0.

3.0 is just WPF, WCF and WF; 3.5 added LINQ and all the compiler goodness.
pb
Monday, 07 January 2008 18:57:01 (GMT Standard Time, UTC+00:00)
Actually C# is at 3.0, while .NET is at 3.5 (nothing new in C# at .NET 3.0).
Monday, 07 January 2008 19:43:23 (GMT Standard Time, UTC+00:00)
"Interesting how things become more and more complex and integrated with each other (with the excuse of becoming simpler, btw - see the schizophrenia in the market here?) and you always need to know more and more things, instead...

Interestingly enough this happens in all fields, and it is not just hitting on developers but also on "system" guys..."

It's actually sublimely ridiculous.

Language designers always seem to forget, once their progeny have taken a footing into the market, that tenet numero uno is that "No general purpose language fits all circumstances....well !". For the same reasons that we have the phrase "lost in translation" when comparing catch-phrases, and jokes. Here we are at C# 3...excuse me 3.5, and our NEWEST language feature is that we can now embed SQL-like syntactic sugar to perform RelAlgebra operations on in-memory data ? Unngh. Cie est progressio?

And while I'm certainly no python guru, from a sheer READABILITY point of view, I'd have to pick IPy over C#, in this case.

As for the wheels of progress, the Windows Driver Model , circa 1999/2000, was supposed to help simplify writing windows device drivers for us. Now everything is a "Foundation". Even plain ol' PDD's can't escape "progress".
Thursday, 17 January 2008 06:02:23 (GMT Standard Time, UTC+00:00)
A quibble:
"Since C# is statically typed, knowing the type’s name is a requirement for retrieving the objects from the list later unless I planned to interact with them as instances of System.Object and access their fields through reflection".

This isn't exactly true, in that the issue isn't C# being a statically typed language. F# is statically typed, but supports tuples as freely as Python (except in F#, a variable can't change its type from one tuple-type to another tuple-type). The issue is that C#'s anonymous types simply don't have the full functionality of tuples (e.g. you can't return an anonymous type from a function in C# (well, there's a trick you can do to get this functionality, but it's not worth it)), and are useful mainly for IEnumerables returned from LINQ queries.
Piper
Comments are closed.