Skip to main content

Posts

Showing posts from 2014

Randomized algorithms: Verifying matrix multiplication

Another great example of randomized algorithms in the introductory chapters of the book " Probability and Computing " is verifying matrix multiplication. Suppose we have 3 matrices of compatible dimensions $A, B$, and $C$, and we want to verify that \[A\cdot B = C\] For simplicity, let's assume that all the matrices are square, and of dimension $n \times n$. The straightforward way to do the verification is to explicitly multiple the matrices $A$ and $B$ together, an operation that is of the order of magnitude of $O(n^3)$. As an aside, we can do better than $O(n^3)$ for matrix multiplication for larger matrices. Wikipedia has an excellent writeup on  Strassen's algorithm  which accomplishes matrix multiplication in $O(n^{2.8074})$ through divide and conquer. First the matrices $A$, $B$, and $C$ are padded with zero columns and rows so that their dimensions are of the form $2^m \times 2^m$. The matrices are then divided into equally sized  block matrices of the f

Classic Algorithms: Page Rank and the importance of web pages

Think of a search query, and go to your favorite search engine and conduct the search. Of the millions of matching documents returned, it is most likely that the first page or two contain what you are looking for. Have you ever wondered how the search engines know how to present the most relevant results for us in the first couple of pages, when all the results returned match the query we were looking for? There are a lot of signals search engines use to determine the importance of the matched results: some are intrinsic to the page such as where in the page the match occurs, whether it is highlighted or not, and the importance of the web page; and some are extrinsic such as how many users clicked on the result for a similar query. The signals, their weights, and the formulas to rank the results are usually guarded secrets by search engines as they are the secret sauce for giving users the best results to their queries. One such signal is the importance of a web page. How would yo

Randomized Algorithms: Polynomial equivalence

We are all accustomed to deterministic algorithms; we work with them every day, and feel comfortable in knowing that the results of running them are predictable, barring a coding error of course. The idea of randomized algorithms feels remote and uncomfortable, despite their usefulness and elegance.  There are a couple of great examples in the introductory chapters of the book " Probability and Computing " that are an eye opener. One is verifying polynomial identities: how can you tell that two different representations of polynomials are the same? For example, if we have two polynomials $P(x)$ and $Q(x)$, both of degree $d$ described by the following formulas: \[ P(x) = \Sigma_{i=0}^{i=d} a_i x^i \\ Q(x) = \Pi_{i=1}^{i=d} (x-b_i) \] how can we determine that they are the same polynomial? Intuitively we first check that the degrees are the same, then we could try to transform one form into the other, either by multiplying out the terms for $Q(x)$, collecting like t

Down memory lane: Approximate counting

At work we do a lot of counting. For some counts we need an accurate result, while for others we can get by with an approximate count. For these approximate counts, as long as the result is within the same order of magnitude of the true count, we're ok. There is a lot of literature on approximate counting techniques, and the blog post  "Probabilistic Data Structures for Web Analytics and Data Mining"  does a great job at explaining some of them in detail, with references to the original papers. One of the original approximate counting papers was the 1978 paper: Counting large numbers of events in small registers by Robert Morris. In the paper Morris explains the  context for the problem, which might seem foreign today. Back in 1978 he was faced with the problem of counting a large number of different events, and because of memory restrictions he could only use 8-bit registers for each counter. Since he was not interested in exact counts, but with accurate enough c

React conference in San Francisco 2014

Last week I attended the React conference in San Francisco . The conference is a two day event where speakers share their experiences on building reactive systems: ones that are resilient, elastic, responsive, and message driven. The reactive manifesto web page   has more detailed information about reactive systems, and why they are useful. This year the conference was at Cobb's Comedy Club , a cozy venue for the roughly 300 people that attended the conference. Because of the tight space, power plugs were non existent, but the organizers were extremely thoughtful and provided every attendee with a rechargeable battery with iPhone and Android connectors. The sessions in the conference were great, but a couple stood out. The first was Netflix's presentation "Resilient by Design", where the speaker talked about how Netflix designs and deploys their services: from using microservices that do one thing and do it very well with well defined interfaces, to cloud servic

Mining the Social Web, by Mathew Russell, O'Reilly Media

"Mining the social web" is a book about how to access social data from the most popular social services today by using the services' public APIs, and analyzing the retrieved data to gain insights about it. The book uses the Python programming language to access and manipulate the data, and provides code snippets of common tasks within the book, as well as full iPython notebooks on Github. The book is written as documentation for the freely available iPython notebooks, with the documentation providing context and background for the code, as well as describing the algorithms used to mine the social data. The author tries to be as concise as possible, although he did not succeed in the first chapter, where the first three section were verbose, and relatively unnecessary,  describing what twitter is and why people use it as a microblogging platform. With that out of the way, the writing style improves as the book progresses, and is a mixture of code examples and step

Can you make me a Cortado please?

A couple of months ago I stumbled upon an  info graphic that depicts popular coffee drinks around the world , and thought I'd give some of these drinks a try. I started with the cortado: a drink that is popular in Spain, Portugal, and Columbia, and consists of one part espresso, and one part steamed milk. Almost every coffee shop that I went to had no idea how to make the drink, and it became a great conversation starter with the coffee barista, describing where I stumbled upon the recipe, and what other coffees are popular in different regions of the world.  I was pleasantly surprised when one barista at Peet's coffee knew how to make the drink from his travels to Spain. He also suggested modifications to the drink that would make it more delicious including using whole milk instead of 2%, adding another shot of espresso, and sweetening the drink with one pack of brown sugar. The final combination is my current favorite.

Migrating from the Macbook Air to the Macbook Pro

My Macbook Air started to show wear and tear after a couple of years of heavy use. Five of the keys on the keyboard broke and had to be replaced, despite my light touch typing--honestly, and the battery has moved from the warning that it needs service to retaining electric charge for shorter and shorter periods of time, to not holding a charge at all. For my next device I contemplated getting another Macbook Air with the highly enviable 12hr battery life, or switching back to a 15 inch Macbook Pro, and enjoying a more powerful machine, with a retina display and a respectable 8 hours of battery life. The prospect of more screen real-estate, and more processor power was too enticing, so I ended up getting the Macbook Pro despite the weight difference. Since I accumulated 3+ years worth of data and software on the Air, I did not want to repeat the process of reinstalling apps from scratch and searched for an easy way to migrate the data to the new machine.  Most of the online recomme

Fitbit bands

I am fairly disappointed in the quality of the Fitbit flex bands. After a couple of months of moderate use, the bands developed deep cracks and finally broke. There are a lot of accounts online and on Facebook from Fitbit users who have experienced similar issues, so the problem is widespread and not isolated. What makes matters worse, is that after ordering and getting a replacement band from Fitbit, a rash developed on my skin. I did not experience a rash with the prior band color, and my guess is that the material is different from the original one I had. Now I have two choices, either get another replacement band with the original color, and hope that it does not cause a rash, or give up on the Fitbit completely, and wait for some of the newer health tracking technologies around the corner. An iWatch perhaps?

First impressions of the Fitbit Flex

Many years ago I bought a pedometer to help me count the number of steps that I take on a daily basis. The idea was to walk at least ten thousand steps every day to balance the modern sedentary lifestyle, caused by sitting on a chair almost all day long, both at work and at home. The pedometer was great for a while, showing how far I was from the ideal number of steps, and making me compete with myself every day to increase the number of steps I take. Because it was bulky and inconvenient to carry in my pocket, when I finally hit the goal of ten thousand steps I stopped carrying it along. And of course the number of steps declined back to where it was before the pedometer, and I did not think anything of it. This year the flu season has not been kind to me. Between a hectic work schedule and a kid at daycare, I have been hit by some cold or flu variant every couple of weeks. My wife attributed the low immunity to a lack of physical activity, and she started looking for a way to help

Juice diets and the Vitamix

Over the weekend I watched "Fat, sick, and nearly dead", a documentary on Netflix about how a sick and overweight Australian regained his health by following a strict juice diet for about 6 months augmented by moderate exercise. The documentary tracks his progress over the 6 months period, as he travels throughout the US from coast to coast, talking to people about the health benefits of the juice diet, and showing them how it affected his health. The documentary also shows how toward the end of his diet, he started evangelizing it, and helping other people reverse their heavily medicated health problems by following a similar diet and moderate exercise. The documentary was very well made, and the author is quite likable. It was also very educational about the types of foods people consume, and the nutrients they get out of them: micro-nutrients--those that come from plant sources, and macro-nutrients--those that come from animal sources. The documentary's thesis is th

Sonos Play:5

Over the holiday season I was looking for a speaker with a decent sound quality, and good support for playing music wirelessly. With a bit of research I settled on the Sonos Play:5 speaker. Most of the online reviews were favorable, and it supported streaming online music from Spotify, Pandora, and a dozen other services. The speakers were also on display at Target stores for a live test, and despite the background noise in the store, they sounded great. Sonos was running a promotion at the time, where you got a free bridge with the purchase of any of their speakers, which is a decent discount. Setup was extremely easy. I plugged the bridge into my wifi router, plugged the speaker into a power outlet, and installed the Sonos application on my computer. The application guided me through the intuitive setup, which involved pressing a button on the bridge, and another on the speaker to connect the whole system together.  After that it was time to play music. The Sonos application imp