Chip’s Technical Blog

Tech commentary of thoughts, challenges, how-to’s, and the mundane.

URL Shorteners

April 20th, 2012

So I’ve been annoyed by the increasing use of URL shorteners. What is a URL shortener, you ask? Well, it’s a service, provided by the owner of a short domain name, that performs URL redirection. It gives you a short URL you can use with services like Twitter or Facebook, though the idea was originally conceived for handling URLs in emails, to avoid problems with copying and pasting when the URL wraps onto the next line. When you follow a short URL, your browser contacts the owner’s web server and is given a 301 or 302 redirect status code in an HTTP response that tells the browser where to locate the actual content, which the browser then fetches. Popular services include tinyurl, bit.ly, goo.gl, and t.co.

So why am I annoyed by it? Mainly because I cannot see where the link is going to end up. I don’t like clicking on links I get from friends unless I know what website I’m going to end up on. This is primarily due to concerns over the potential that the link might take me to a malicious site, or a site I might find offensive. But a number of friends post links that I think I might find interesting–yet I never follow them because they are shortened URLs.

A secondary concern is the tracking done by the URL shortening company. It allows a third party company to track all references to a destination site through the posted URL, which to me seems like a loss of privacy.

Granted, for services like Twitter, which have a small upper bound on message size, something like this is needed, albeit only because of the somewhat arbitrary limits placed on message sizes. So what solution might I offer? For my primary concern (knowing where you’re headed before you visit the website), it might be good if browsers recognized the URLs of short URL providers and used a HEAD request to determine the actual destination to present to users. Alternately, a browser might include a new feature for issuing a HEAD request for a link instead of a standard GET request. The HEAD request, rather than opening the URL, just asks the server for the headers it would return for that URL, and these could be presented to the user so they know where the link leads before following it.
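To make the HEAD idea concrete, here is a minimal sketch in Python of what such a browser feature might do under the hood: send one HEAD request, do not follow the redirect, and report the Location header. The function names (`redirect_target`, `peek`) and the host `short.example` are made up for illustration, and the sketch assumes the shortener answers HEAD requests the same way it answers GET, which not every service is guaranteed to do.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes that tell the client to look elsewhere for the content.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(url, status, location):
    """Return the absolute destination if the response is a redirect, else None."""
    if status in REDIRECT_CODES and location:
        # Location may be relative, so resolve it against the requested URL.
        return urljoin(url, location)
    return None


def peek(url, timeout=5.0):
    """Send a single HEAD request for url and return where it would redirect.

    The redirect is NOT followed; only the headers are fetched.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return redirect_target(url, response.status, response.getheader("Location"))
    finally:
        conn.close()
```

Calling `peek("http://short.example/abc")` (a hypothetical short URL) would return the destination the shortener points at, or `None` if the server did not answer with a redirect. A shortener may also chain to another shortener, so a real tool would probably repeat this a bounded number of times.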

MaceKen: USENIX ATC 2012

March 23rd, 2012

I learned this morning that our submission to the 2012 USENIX Annual Technical Conference (ATC) has been accepted for publication. As with other paper announcements on this blog, I am merely sharing the good news, and forward-referencing an eventual post describing the paper at our research group website (http://www.macesystems.org/). Briefly, the paper is about supporting a new failure model for programming large-scale distributed systems, allowing those systems to ignore crash-restart failures using our otherwise pre-existing Mace programming model. Sunghwan Yoo is the main student author on the paper, and it is done in collaboration with Terence Kelly at HP Labs; Hyoun Kyu Cho, a prior intern of his; and Steve Plite, of the IT staff of the CS department at Purdue.

Trying to be paperless

March 23rd, 2012

In the quest to be paperless, both for environmental reasons and to be gentler on my back, I try not to carry around printouts of documents that are just as easily accessed using my tablet, phone, or computer. However, I have grown increasingly frustrated that my attempts to be productive without paper are hampered by the FAA and its historical policy preventing any electronics from being used during takeoff and landing. Since on many of the flights I take this is more than half of the flight time, my air travel time has been wholly unproductive unless I have paper documents to work with. I was prepared to write a post criticizing this, questioning whether any data supported the risks, and considering how a tech company like Apple or Amazon could score major points by convincing the FAA to certify their tablet devices for takeoff and landing usage. Accordingly, I was very encouraged to read in articles such as this one at Ars that the FAA is in fact planning to take a “fresh look” at the use of certain electronic devices during takeoff and landing.

If such efforts fail, I next wonder what tech can produce – some ability to create a display before takeoff that can be “frozen” during a “powered-off” state, that might be large enough to keep me busy reading during takeoff and landing…

Phone self-repair

January 5th, 2012

I just got done replacing my iPhone screen, which I shattered over the break. The part was $30 on Amazon, as compared to a $130 third party repair, or probably a $200-300 Apple/AT&T repair. Yes, I am feeling pretty satisfied right now. :-)

There are plenty of good videos and resources out there, so I won’t add my own, but would just like to say—yes, it can be done!

NSDI 2012 Paper Accepted: Distalyzer

December 14th, 2011

So I learned last night that our submission to NSDI was accepted. It describes the methodology for a tool we built, which we call Distalyzer. Distalyzer works to help developers diagnose performance problems in their systems. It works by utilizing a minimal amount of structure from the logs, and then doing two kinds of analysis (t-tests and dependency networks) to discover the most relevant and most divergent aspects of groups of log instances. We have used it to find bugs in Transmission and HBase, and applied it to another system’s problem (TritonSort) to show reductions in the effort needed to diagnose the problem.
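For a feel of what a t-test over groups of log instances can look like, here is a small generic sketch in Python. It is not Distalyzer's code: the function names, the choice of Welch's variant of the t-test, and the feature names are all assumptions made for illustration. It assumes each feature has at least two samples per group and that the two samples are not both constant.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two samples that may have unequal variances."""
    standard_error = math.sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


def rank_features(good_runs, bad_runs):
    """Rank features by how strongly they diverge between two groups of logs.

    Both arguments map a feature name (hypothetical, e.g. "latency_ms") to the
    list of values extracted from that group's log instances.
    """
    scores = {
        name: welch_t(bad_runs[name], good_runs[name])
        for name in good_runs
        if name in bad_runs
    }
    # The largest absolute t statistic marks the most divergent feature.
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
```

The feature at the head of the returned list is the one whose values differ most between the two groups relative to their spread, which is where a developer would start looking.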

Soon there will be a blog post at our research group website which describes it in more detail, but I wanted to go ahead and post about the good news.

Recent Security Papers

December 9th, 2011

So in research progress, we’ve recently published or had accepted two conference papers in the area of distributed system security. The first is a paper called “Removing the Blinders”, with co-authors David Zage and Cristina Nita-Rotaru. The basic insight of the paper is that in many protocols, nodes make decisions about other nodes based on just the last message they got from them. This is a kind of “blinders”, hiding other information the node has about the other nodes, which prevents them from making smart decisions about the peers based on the holistic information available.

However, the effort required in the first paper is totally manual. Discovering the set of attacks, and then finding the defenses for those attacks, takes a smart person thinking about it for a long time. We next set out to solve part of the problem: discovering the attacks. We focused on a restricted set of systems, those implemented in a structured language such as Mace. By applying a greedy state space exploration search strategy, we can discover a class of attacks that cause poor performance in systems. This work was accepted to NDSS 2012, about a tool we call Gatling.

Meanwhile, part of our current research involves further generalizing this work.

A real-foods diet

December 9th, 2011

So over the last several months, I’ve been working at losing weight. So far, I’ve lost about 26 pounds. My diet?

  • Less Soda. I started out by cutting back to 1 every other day. Now it’s more like 1 every once in a while. (i.e. 1-ish a week). If you figure I was having 1-3 in a day, that’s basically a drop of 150-450 calories a day, without replacement. In the place of the Dr Pepper, I’ve been mostly drinking water. I’ve also cut out my morning juice, for the most part, after being convinced that juice provides a lot of sugar/calories and not a lot of nutrition.
  • Eating products that are less processed, and more whole. This means reading ingredient lists. If there are a bunch of ingredients I don’t recognize, probably not a good sign. Most recently, this meant buying regular, all-natural sour cream rather than the fat free sour cream. When I stopped to look at the ingredient list of the fat free sour cream, it had many, many ingredients as compared to the regular sour cream, including cellulose, which is basically the same thing as paper or wood pulp as I understand it. We just ate stroganoff made from the regular sour cream, and WOW was it good! So basically, while I’m losing weight, I’m eating richer, tastier foods, and feeling less hungry.
  • Accepting that my body had no clue what full and hungry meant. This can be attributed to many things, including the over-eating I was doing and the highly processed foods, especially the “light” foods I was eating, which were essentially training my body that food did not correlate with calories. There were nights that I would make dinner portions for Kristina and me, and we would be eating, and she would remark how she was full and should stop. I would have been quite content to keep eating, but would follow her lead, assume that what I had eaten was enough to fill me, and stop. I’m still not quite there, but in the evenings I have started to recognize sometimes when I’m not hungry, so that I opt not to snack because I’m not hungry rather than just because I know intellectually that I don’t need more food (which is still more common).

On the down side, one of the things we did as part of this movement was to visit our local dairy and buy a set of local cheeses. Tasty, yes. (Not necessarily more whole than what we could get at the store, but buying fresh/local is still something I’ve been doing as part of this general movement). Anyway – this morning we learned that the cheese we ate/bought while there is subject to a voluntary recall due to possible bacterial contamination with a 3-70 day incubation period. Oh well.

One resource I have really enjoyed using for this is Fooducate. I enjoy both following their blog, and using their iPhone app to scan products to find out tidbits of things I ought to know.

Since I’ve been doing so much reading about food, I find I may post more about it too, so I’ve created a category for food.

HPDC Paper (InContext: Simple Parallelism for Distributed Applications)

June 13th, 2011

This past week, one of my students presented his first paper at HPDC. There is a more detailed blog entry at the research website by the student, but I wanted to mention it here too. (Post: http://www.macesystems.org/2011/06/incontext-simple-parallelism-for-distributed-applications/)

The very short story: the Mace toolkit has scalability issues since events must run atomically (think a big lock protecting events to run only one at a time). This paper describes the first step towards loosening that restriction, and running different events in parallel as long as they are not both trying to write to global state.
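To picture the difference between "one big lock" and the loosened rule, here is a rough Python sketch built around a readers-writer lock. This is not how Mace or InContext is implemented; the class and function names are invented, and the sketch is more conservative than the rule described above, since it also keeps a writing event from overlapping with a reading one.

```python
import threading


class ReadWriteLock:
    """Many readers may hold the lock together; a writer holds it alone."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


def run_event(lock, handler, writes_global_state):
    """Run one event handler, exclusively only if it writes global state."""
    if writes_global_state:
        lock.acquire_write()
        try:
            return handler()
        finally:
            lock.release_write()
    lock.acquire_read()
    try:
        return handler()
    finally:
        lock.release_read()
```

With the single big lock, every call would take the exclusive path. Here, events that declare they only read global state can overlap, and only the writers serialize. Note that this simple lock can starve writers if readers keep arriving.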

FSE Paper (Finding Latent Performance Bugs in Systems Implementations)

May 2nd, 2011

This post was promised some time ago, about our paper published at the conference on Foundations of Software Engineering (2010), a top conference in software engineering. Instead of posting it here, however, Karthik (one of my student co-authors) posted a description of our FSE paper here: http://www.macesystems.org/2011/04/finding-latent-performance-bugs-in-systems-implementations-fse-2010/

The very short description: by building robust systems, we hide some of our correctness bugs, converting them into performance problems. Our paper is about using model checking concepts to discover such bugs with a minimum of developer effort. See the post and paper on the group website.

Public WiFi: should you use a VPN if you only use HTTPS sites?

April 23rd, 2011

I got this question from a friend, so thought I would post these thoughts in case they help others too.

Okay – so to VPN or not to VPN on a public wi-fi network….

I guess, in the end, it all comes down to the security concerns you have.

Before discussing details, I’ll start by saying that I do not often personally connect to a VPN when using a public WiFi network, even though Purdue hosts one that I could use.

The technical difference between VPN and HTTPS comes down to the layer of the network stack where the encryption takes place. A VPN would encrypt all traffic leaving your machine, but moreover, would direct it all to your VPN provider (your desktop, as the article suggests). Once it reaches your desktop, it will travel over the desktop’s normal network path to the rest of the internet. HTTPS, on the other hand, is applied to a specific and single network connection between your mobile device and a given server.

So, considering only traffic to HTTPS sites, let’s look at what information is leaked.

  • With the VPN, all traffic is destined for your desktop. On the one hand, this is good, because no one can tell what sites/services you are using. None of your network traffic, except that which was used to set up the VPN, is readable on the public network. There are, however, two kinds of things which are leaked. (1) The volume and pattern of traffic you use. [There is no solution for this. But you should be aware that it is viewable to all, and there may be profiling techniques which can be applied to learn things based on this.] (2) The fact that you have a connection with your VPN provider. From a privacy standpoint, this in fact may be a very serious concern, because if you are using your desktop as your VPN, then it may very precisely identify who you are, where you live (see an article today in Ars Technica on mapping based on an IP), etc.
  • With HTTPS, only the web traffic to the given server(s) is encrypted. In particular, other information is leaked. (1) The IP addresses of all the sites you connect to, which may identify who you bank with, who you work for, who your email provider is, etc. (2) The DNS queries you issue, which would make it even easier to identify what sites you are visiting, without having to reverse-map an IP-to-hostname, when the IP may have multiple hostnames. (3) More precise information about your traffic patterns, since it is subdivided by destination rather than being aggregated in the VPN case. (4) Some HTTPS sites will include static content or images from a non-encrypted source (some browsers warn about such things). This information of course would also be unencrypted.

Next, consider the other traffic your mobile device may be sending. For example, if it participates in any convenience networks (e.g. Bonjour for peer host discovery), this traffic will all be present too, and may or may not be encrypted, based on the service.

Another consideration is the exposure to attack your device has. In both cases, your device is connected to the wireless network. However, in the VPN case, the default settings of the device may be generally more secure, since the wireless network wouldn’t need to support some of the extra traffic. It becomes harder to launch an attack, since the machine is mostly looking for traffic from the VPN, and will ignore most local traffic. HTTPS leaves any such services (e.g. iTunes listening for connections from the Remote.app on the iPhone) listening.

Finally, there is the cost. VPN adds an extra layer of overhead, and an extra layer of places where things can go wrong. Also, all traffic is going through your desktop, which may significantly reduce the bandwidth you can achieve, and add latency. (And of course, an HTTPS site when using a VPN is being encrypted twice – once at the HTTPS layer, and once at the VPN layer). Further, the choice of a VPN vs HTTPS may have other unpredictable effects – a wireless network provider may block VPN traffic, or possibly deprioritize it. Or they might do the same for HTTPS traffic (though deprioritizing is more likely than blocking in this case).

Okay, one more consideration – which is the quality of the encryption. Both technologies can provide a range of encryption quality, so vigilance must be used in ensuring effective encryption is used. Some browsers will warn about weak SSL configurations on servers, but VPN encryption quality is generally less well verified.
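If you want to check the HTTPS side yourself rather than rely on browser warnings, a short Python sketch like the following reports what a server actually negotiates. The helper names and the "weak" thresholds (anything older than TLS 1.2, or fewer than 128 secret bits) are my own assumptions for illustration, not an official standard.

```python
import socket
import ssl

# Assumed policy for this sketch: protocol versions treated as weak.
WEAK_VERSIONS = {"SSLv2", "SSLv3", "TLSv1", "TLSv1.1"}


def looks_weak(version, secret_bits):
    """Apply the assumed policy to a negotiated version and key strength."""
    return version in WEAK_VERSIONS or secret_bits < 128


def inspect_tls(host, port=443, timeout=5.0):
    """Connect to host and report the negotiated protocol version and cipher."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            # cipher() returns (cipher name, protocol version, secret bits).
            name, _protocol, bits = tls.cipher()
            version = tls.version()
    return {
        "version": version,
        "cipher": name,
        "bits": bits,
        "weak": looks_weak(version, bits),
    }
```

Calling `inspect_tls("www.example.org")` returns a small dictionary describing the connection. One caveat: the default context in recent Python versions already refuses very old protocols, so against a badly configured server the connection may simply fail with an `ssl.SSLError` instead of reporting a weak result.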

Hope this helps,
Chip
