Downloading PDC10 videos using the new async feature


PDC10 has an OData endpoint at http://odata.microsoftpdc.com/ODataSchedule.svc/ . The best part about OData is that you can query for exactly the data you are looking for. Here is my OData URL for filtering on the Twitter hashtag #languages:


http://odata.microsoftpdc.com/ODataSchedule.svc/Sessions()?$filter=startswith(TwitterHashtag,'%23languages')&$expand=DownloadableContent&$select=DownloadableContent
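
That URL is just three OData query options ($filter, $expand and $select) appended to the Sessions resource. As a rough sketch of how it can be put together in code (the ODataUrl class and its Build method are hypothetical names made up for this example, not part of any OData library):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of any OData client library.
static class ODataUrl
{
    // Appends query options such as $filter or $select to a resource URL.
    public static string Build(string resource, IEnumerable<KeyValuePair<string, string>> options)
    {
        var query = string.Join("&", options.Select(o => o.Key + "=" + o.Value));
        return query.Length == 0 ? resource : resource + "?" + query;
    }
}

static class Program
{
    static void Main()
    {
        // '#' would start a URL fragment, so the hashtag is percent-encoded (# becomes %23).
        var hashtag = Uri.EscapeDataString("#languages");

        var options = new[]
        {
            new KeyValuePair<string, string>("$filter", "startswith(TwitterHashtag,'" + hashtag + "')"),
            new KeyValuePair<string, string>("$expand", "DownloadableContent"),
            new KeyValuePair<string, string>("$select", "DownloadableContent"),
        };

        Console.WriteLine(ODataUrl.Build("http://odata.microsoftpdc.com/ODataSchedule.svc/Sessions()", options));
    }
}
```

Only the hashtag is escaped here; the rest of the filter expression is passed through as written, which is enough for this particular query but not a general-purpose encoder.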

From the above OData feed I could get the URLs of the low-bandwidth MP4s to download. Here is the sample code for filtering them out of a saved copy of the feed:


var x = XDocument.Load(@"c:\temp\session.xml").Descendants().AsParallel()
    .Where(xd => xd.Name.LocalName == "Url" && xd.Value.Contains("_Low.mp4"))
    .Select(xd => xd.Value);

Now that I have the URLs, here is the code to download the videos using the new async feature:

using System;
using System.IO;
using System.Linq;
using System.Net;
using System.Threading.Tasks;
using System.Xml.Linq;

namespace Test
{
    class Foo
    {
        static void Main(string[] args)
        {
            // Block until every download has finished.
            DownloadAsync().Wait();
        }

        // Returns Task rather than void so the caller can wait on it and see any exception.
        static async Task DownloadAsync()
        {
            var result = new WebClient().DownloadStringTaskAsync("http://odata.microsoftpdc.com/ODataSchedule.svc/Sessions()?$filter=startswith(TwitterHashtag,'%23languages')&$expand=DownloadableContent&$select=DownloadableContent");

            // One WebClient per file; ToList() starts every download right away.
            var downloads = XDocument.Parse(await result).Descendants().AsParallel()
                .Where(xd => xd.Name.LocalName == "Url" && xd.Value.Contains("_Low.mp4"))
                .Select(xd => new WebClient().DownloadFileTaskAsync(xd.Value, Path.GetFileName(xd.Value)))
                .ToList();

            // The Async CTP exposed this as TaskEx.WhenAll; .NET 4.5 and later call it Task.WhenAll.
            await Task.WhenAll(downloads);
            Console.WriteLine("Downloading Complete");
        }
    }
}

Combining Stack Overflow RSS, OData and API to query


In my opinion Stack Overflow holds a ton of knowledge and is a great place to pick up new tricks. There are some really smart people in the SO community, and I try to learn new things there whenever I find time.

I subscribe to RSS feeds for new questions on particular topics. For example, here is the one for F# from Stack Overflow: http://stackoverflow.com/feeds/tag/f%23. The advantage of the RSS feed is that I get to see new questions; the drawback is that I have to navigate to the site to look for the answers. AFAIK Stacky (a .NET client for the Stack Overflow API) does not provide a mechanism for querying new questions by tag.

It was easy to combine the two to solve my problem: the RSS feed lets me discover new questions, and Stacky gets me their answers. I use LINQPad as a scratchpad, so it was easy to write up something quick.


void Main()
{
 var reader = XmlReader.Create("http://stackoverflow.com/feeds/tag/f%23");
 var feed = SyndicationFeed.Load<SyndicationFeed>(reader);
 var length = "http://stackoverflow.com/questions/".Length;

 var client = new StackyClient("1.0", File.ReadAllText(@"c:\temp\so.txt"),HostSite.StackOverflow,new UrlClient(), new JsonProtocol());

 // Feed item ids look like http://stackoverflow.com/questions/<id>/<slug>; pull out the numeric id.
 var feedItems = from item in feed.Items
                 let nextOccurrence = item.Id.IndexOf("/", length)
                 let id = Convert.ToInt32(item.Id.Substring(length, nextOccurrence - length))
                 select new { Id = id, Title = item.Title.Text, Body = item.Summary.Text.StripHTML() };

 var answers = client.GetQuestionAnswers(feedItems.Select (y => y.Id),new AnswerOptions() { IncludeBody = true});

 // The latest F# feed questions and answers
 var qa = from question in feedItems
          join answer in answers on question.Id equals answer.QuestionId
          where answer.Accepted == true
          // question.Body was already stripped of HTML when feedItems was built
          select new { Title = question.Title, Question = question.Body, Answer = answer.Body.StripHTML() };
 qa.Dump();
}
public static class Extensions
{
    public static string StripHTML(this string s)
    {
        return Regex.Replace(s, @"<(.|\n)*?>", string.Empty);
    }
}
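
The Substring arithmetic in the query depends on feed item ids having the form http://stackoverflow.com/questions/<number>/<slug>. Here is a minimal sketch of the same extraction using System.Uri, assuming that id shape (FeedIds and QuestionId are hypothetical names, and the id in Main is made up for illustration):

```csharp
using System;

static class FeedIds
{
    // Assumes ids shaped like http://stackoverflow.com/questions/<number>/<slug>.
    // Returns null when the id does not match that shape.
    public static int? QuestionId(string feedItemId)
    {
        Uri uri;
        if (!Uri.TryCreate(feedItemId, UriKind.Absolute, out uri)) return null;

        // For the shape above the segments are "/", "questions/", "<number>/", "<slug>".
        var segments = uri.Segments;
        if (segments.Length < 3 || segments[1] != "questions/") return null;

        int id;
        return int.TryParse(segments[2].TrimEnd('/'), out id) ? id : (int?)null;
    }
}

static class Program
{
    static void Main()
    {
        // Made-up id, used only to show the call.
        Console.WriteLine(FeedIds.QuestionId("http://stackoverflow.com/questions/4012345/example-title"));
    }
}
```

Returning null instead of throwing lets a caller skip any feed entry whose id does not fit the expected pattern.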

If you have been following F# and functional programming, you probably know Tomas Petricek, and I would also like to read what he has been answering. Again, Stacky does not provide an API to look up a user by name. This is where the SO OData feed comes in handy, and LINQPad handles OData very well. Here is the code that gets Tomas's user id via OData and then uses Stacky to fetch the questions he has answered along with his answers.

var tomas = Users.Where(u => u.DisplayName.StartsWith("Tomas Pet")).First().Id;
var tomasQA = from ans in client.GetUsersAnswers(tomas, new AnswerOptions() { IncludeBody = true })
              select new { Title = ans.Title,
                           Question = client.GetQuestion(ans.QuestionId, true, false).Body.StripHTML(),
                           Answer = ans.Body.StripHTML() };
tomasQA.Dump();

Using Tech-Ed OData to download videos


I wanted to watch the Tech-Ed 2010 videos, but I did not want to go through the site manually to download the files for offline viewing. I was also interested only in Dev sessions at level 300 or 400. Thanks to the Tech-Ed OData feed at http://odata.msteched.com/sessions.svc/ , I could write three statements in LINQPad and have them downloaded using wget:

File.Delete(@"C:\temp\download.txt");

Sessions
.Where (s => (s.Level.StartsWith("400") ||  s.Level.StartsWith("300") ) && s.Code.StartsWith("DEV"))
.Take(10)
.ToList()
.Select (s => @"http://ecn.channel9.msdn.com/o9/te/NorthAmerica/2010/mp4/" + s.Code + ".mp4" )
.Run(s => File.AppendAllText(@"C:\temp\download.txt",s + Environment.NewLine));

Util.Cmd(@"wget.exe -b -i c:\Temp\download.txt",true);

I forgot to mention that the Run extension method comes from Reactive Extensions.
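
In the snippet above, Run is only used to enumerate the sequence and call an action on each element. As far as I can tell, later releases of the Interactive Extensions ship the same behavior under the name ForEach. For anyone without either library, here is a minimal stand-in, assuming that is all the call relies on (EnumerableSketch is a hypothetical name):

```csharp
using System;
using System.Collections.Generic;

static class EnumerableSketch
{
    // Stand-in for the Rx-era Run: eagerly walks the sequence for its side effects.
    public static void Run<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (action == null) throw new ArgumentNullException("action");
        foreach (var item in source) action(item);
    }
}

static class Program
{
    static void Main()
    {
        // Each element is passed to the action in order.
        new[] { "DEV301.mp4", "DEV401.mp4" }.Run(name => Console.WriteLine(name));
    }
}
```

The file names in Main are placeholders. Unlike Select, Run returns nothing and executes immediately, which is why it works as the last step of the query.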

Using LINQ and Reactive Extensions to overcome limitations in OData query operators


I was pleased to learn that Netflix has an OData API. The practical reason, obviously, is to use the API to query for the movies I want to watch. As I mentioned in my previous post, I will be using LINQPad 4 for querying because of its built-in support for OData as well as Rx.

One thing I discovered after playing around with OData is that not every LINQ query operator is available in OData. For example, the Netflix API supports only 4 operators:

  1. Filter
  2. Skip
  3. Take
  4. Orderby

In addition, each request returns at most 20 rows. So to get 40 rows, my first request returns the first 20, and my next request has to skip those 20 and take the next 20. These are some of the limitations.
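
The skip/take bookkeeping described above can be sketched on its own, assuming a fixed page size of 20 (Paging and Offsets are hypothetical names, and no request is made here):

```csharp
using System;
using System.Collections.Generic;

static class Paging
{
    // Yields the $skip value for each request needed to cover totalRows.
    // Because this is an iterator, the argument check runs when enumeration starts.
    public static IEnumerable<int> Offsets(int totalRows, int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");
        for (var skip = 0; skip < totalRows; skip += pageSize)
            yield return skip;
    }
}

static class Program
{
    static void Main()
    {
        // 40 rows at 20 per page needs two requests: skip 0, then skip 20.
        foreach (var skip in Paging.Offsets(40, 20))
            Console.WriteLine("$skip=" + skip + "&$top=20");
    }
}
```

For 400 rows the same helper yields 20 offsets (0, 20, 40 and so on up to 380), one per request.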

Here is what I wanted from Netflix: movie listings with an average rating greater than 3.5, ordered by release year descending, and grouped by whether they are available for instant watch. That way I can have one queue for movies I want to watch online and another for movies I have to request by mail (the ones that are not available for instant watch). Here is the query to do that:


var movies = from counter in (from e in Enumerable.Range(0, 400) where e % 20 == 0 select e).ToObservable()
             from movieTitle in Titles.Where(t => t.AverageRating > 3.5).OrderByDescending(t => t.ReleaseYear).Skip(counter).Take(20).ToObservable()
             select movieTitle;

var moviesILikeToWatch = from counter in movies
                         group counter by counter.Instant.Available into g
                         select g;
moviesILikeToWatch.Dump();

The first “from counter” clause builds the skip part. As mentioned, each request returns only 20 rows by default and I wanted 400, so I used Enumerable.Range to generate the sequence of offsets to skip in the next clause. I could very well have used a for loop to build this, but I wanted to try to write terse code. These are the actual calls to the Netflix OData API:

http://odata.netflix.com/Catalog/Titles()?$filter=AverageRating gt 3.5&$orderby=ReleaseYear desc&$skip=0&$top=20
http://odata.netflix.com/Catalog/Titles()?$filter=AverageRating gt 3.5&$orderby=ReleaseYear desc&$skip=20&$top=20

LINQPad makes 20 such calls to Netflix to get the 400 movie listings.

The next clause in the first query, “from movieTitle”, is a simple LINQ query that gets the movies matching the filter criteria, along with skip and take. The second query is separate because the OData API doesn’t provide a groupby operator; if I included the grouping in my first query, LINQPad would try to convert it into an OData request, which would fail. So essentially I am getting all the data from the server and then grouping it locally.
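
The split between server-side filtering and client-side grouping can be illustrated without touching the service. In this sketch the Title class and its sample rows are made up; the point is only that GroupBy runs over rows that are already in memory:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape for illustration; the real Netflix entity has many more properties.
class Title
{
    public string Name { get; set; }
    public bool InstantAvailable { get; set; }
}

static class LocalGrouping
{
    // Groups rows that have already been fetched; nothing here is translated to OData.
    public static IEnumerable<IGrouping<bool, Title>> ByInstantAvailability(IEnumerable<Title> fetched)
    {
        return fetched.GroupBy(t => t.InstantAvailable);
    }
}

static class Program
{
    static void Main()
    {
        // Made-up rows standing in for results already downloaded page by page.
        var fetched = new List<Title>
        {
            new Title { Name = "Movie A", InstantAvailable = true },
            new Title { Name = "Movie B", InstantAvailable = false },
            new Title { Name = "Movie C", InstantAvailable = true },
        };

        foreach (var bucket in LocalGrouping.ByInstantAvailability(fetched))
            Console.WriteLine(bucket.Key + ": " + string.Join(", ", bucket.Select(t => t.Name)));
    }
}
```

With a real IQueryable source, calling AsEnumerable() before GroupBy marks the same boundary: everything before it is sent to the server, everything after it runs locally.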

This wouldn’t have been possible without OData.
