Privacy and Google

I have been a long-time fan of Google, and I generally find their products convenient, high-quality, useful, and innovative. Unfortunately, with all those useful products has come an ever-decreasing amount of privacy. It is now possible to do much of your daily computing work entirely with Google products: email, web search, documents, maps, chat, voice & video chat, YouTube videos, Picasa images, and now even DNS. Beyond all of this content which users freely give to Google, their analytics allow them to track you even further as you venture into non-Google sites, since these sites often use Google to provide ads, which allows Google to see what other sites you visit.

Recently, Eric Schmidt, CEO of Google, responded to questions in a televised interview in a way which suggested a lack of concern for the preservation of privacy (some posts about it: a link to the video of the interview, the Mozilla CEO’s blog urging a switch to Bing, notes from the Electronic Frontier Foundation, and a summary of various posters). I am a little torn about how to react to this. On the one hand, I admit that the statement is factual: Schmidt is telling people that Google and other companies store this information, it is all information made available to Google, and Google is legally obligated in some cases to make this information available to authorities. On the other hand, I think it shows a complete disregard for the importance of user privacy, and the need for companies who are given such information to be good stewards of that information. Simply hiding behind the Patriot Act as a way to defend your practice of gathering, correlating, and then divulging information as desired may be legally sound, but does not reflect the attitude I wish the company gathering information to be taking.

Consider this: if you use Google DNS, yes, you may get some performance benefit over the DNS provider from your ISP. But by telling your computer to use Google’s DNS servers, you will effectively be sharing the names of all websites you visit with Google. And don’t think for a minute that their automated mechanisms won’t correlate this with both the searches you send to Google and the data of all Google accounts you maintain. (Note that a naive objection here is that you otherwise give this to your ISP. But your ISP is already transferring all of your network communications, so that information will be available to your ISP whether or not you send them your DNS queries.)

From Schmidt’s point of view – you should already be aware of what information you are giving to Google, and the fact that Google may pass it along to others. But are you? Google could pick things out of emails you send or receive, match that up with the GPS data from your cell phone that you pass to Google so it can place you on the map, the searches you conduct on your phone, and the hostnames of sites that you visit, and build a very precise picture of what you’re doing. Granted, certain things in that profile might have been misinterpreted, and that profile could make it look like you’re doing something very embarrassing.

You see, it’s not the fact that Google has or is collecting this information per se that’s bothering me, but rather that when making this statement, Schmidt doesn’t say “Yes, all of this is true, BUT here are the steps Google is taking to safeguard users…” No, Google’s business is enhanced precisely because they can build these profiles of users.

Google’s long-standing reputation for following their motto of “don’t be evil” is eroding. Thus far, I wouldn’t necessarily say they are being evil. But I no longer feel I can endorse them whole-heartedly as I used to. You should think seriously about what tools and products of Google’s you are using, and what kind of information you are giving them, and whether or not you really want a company that does not proclaim a desire to defend your privacy to have all that information about you. Does the cost outweigh the benefit of using their products?

Time for new grammar checking

Subtitle: protecting the language

So I am often bad with sayings and grammar; as a result, I am fascinated by the subject. There are a number of these which I have often misused, some of which include:

  • I’ve used “here here” when I should have used “hear hear”. See this post for details.
  • I’ve said “intensive purposes” when I should have said “intents and purposes”. I had even worked out a definition, including how they differed from “extensive purposes”.
  • I’ve said “mute point” when it should be “moot point”. See here for details.
  • I also mix up a variety of sayings, such as the nonsensical “double-bladed edge”.
  • And the list goes on…

In addition to the language rules blog linked above, another great place to learn about such things is A Way With Words, broadcast on many local NPR stations.

Unfortunately, I see these mistakes as muddying up our language, so I would like to get better about it. But rather than going out and searching down each of these cases to learn independently, I feel like there is a better solution. As a computer scientist, I recognize that we already have the perfect mechanism for this: the grammar checker. Our word processing software already has grammar checkers, and our web browsers have spell checking, so it may just be a matter of time. I think the grammar checker should be adapted to look for misuses of the language, and to suggest alternatives to writers. This suggestion should come complete with internet links to learn more about the cases found. So in the future, when I “tow the line”, it can let me know that I should instead “toe the line”. So short of listing all commonly mistaken sayings, how can we build software to do this? That is my question. But in the meantime, I would just settle for a [possibly community maintained] database of common mistakes it can check for.
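
To make the database idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any real grammar checker works: the `COMMON_MISTAKES` table and the `check_phrases` function are names I made up, and the entries are just the examples from this post.

```python
import re

# Hypothetical starter database: mistaken phrase -> suggested correction.
# The entries come from the examples in this post; a real list would be
# much longer and maintained by a community.
COMMON_MISTAKES = {
    "here here": "hear hear",
    "intensive purposes": "intents and purposes",
    "mute point": "moot point",
    "tow the line": "toe the line",
}


def check_phrases(text, mistakes=COMMON_MISTAKES):
    """Return (position, found phrase, suggestion) for each known mistake in text."""
    findings = []
    for wrong, right in mistakes.items():
        # Match whole words only, ignoring case and allowing any whitespace
        # (including a line break) between the words of the phrase.
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in wrong.split()) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            findings.append((match.start(), match.group(0), right))
    return sorted(findings)


if __name__ == "__main__":
    sample = "For all intensive purposes, it is a mute point whether I tow the line."
    for position, found, suggestion in check_phrases(sample):
        print(f"at {position}: '{found}' -> consider '{suggestion}'")
```

A plain lookup like this only catches phrases someone has already listed, which is exactly the limitation raised above. It also cannot tell a misuse from a legitimate use (a tugboat really can tow the line), so a real checker would need context as well.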

"Pending Obituaries"

So at the Journal and Courier online (thanks for the recent feature as blog of the week!), they run a daily story about “pending obituaries”. For examples, just do a quick search.

I, for one, find this practice very amusing. It is particularly amusing when it shows up in the optimized page for hand-held devices as “breaking news.” I mean, judging on the title alone, it seems like this should refer to people who are about to die, but who haven’t yet. Something like “John Doe, who was hit by a truck this morning, is in critical condition at the hospital. He is expected to pass within the hour.” Instead, it seems to be brief announcements of the completion of life, but without the full details. (These articles inevitably contain a sentence at the end referring the reader to the next day’s edition of the J&C.)

In the future, may I suggest (to no-one in particular, since I’m not making a point of sending this to the paper) that we find a more appropriate name for these articles? Though I have to say obituaries in progress or partial obituaries aren’t any better. Maybe we should change the word obituary, since it tends to suggest that it should have full details. Perhaps we could call them death notices instead? Then refer readers to the full obituary in the print edition. Or, maybe just leave it as it is. Maybe we need a sense of humor about the passing of our lives.

The "No Camera" Rule

Tonight we were at the LeAnn Rimes concert, where a no-cameras sign was posted at the entrance gate. As you might guess from the fact that I plan to post 2-3 photos of my own later, this rule is not well enforced.

So what’s the problem? Of course, the problem is that you can’t buy a cell-phone anymore without a camera, just about. Thus, unless you are going to either check every cell phone, or not allow them at all, you aren’t going to keep cameras out of the concert. (Of course, these cameras are also small, so unless you plan to use a metal detector, you probably won’t notice the cameras [cell phones] anyway.)

Thus, it follows that there were a LOT of people taking pictures at the concert with their cell phones. Throughout the concert they would walk in front of the first row (between it and the stage) and pause long enough to snap a picture. This of course was very annoying for those of us in the first few rows. There was also no attempt on the part of security to prevent or curb this activity. After all, what are you going to do, short of making people cross the venue at the back?

So accept as a given that people will have cell-phone cameras. It then no longer makes any sense to prevent the use of the vast majority of consumer-grade cameras, which are only marginally better than the current generation of cell-phone cameras. And not surprisingly, there were people using those as well. Oh, and the bigger ones too: no, not so big that they were bigger than someone’s head, but still quite big. People were also not shy about it. You might expect them to quickly snap the photo and then hide the camera so as to pretend they didn’t take a photo. No, they would walk right to the front with their quite-obvious camera, and take a picture, complete with flash. Oh heck, why stop at just one? Get another one while we’re up here, in case the first doesn’t turn out.

The woman seated just in front of us and to the side got a bunch of pictures, many quite good. (I know, because it was impossible to avoid watching her LCD screen as she set up the shot, took it, and then checked its quality. It was, after all, being held over the level of all our heads while she did so, to avoid anyone in the audience being part of the photo.)

So, I think the time may have come to abandon the no-cameras rule, since it is so clearly not actually enforced. Instead, we should be thinking of ways to make the cameras less obtrusive during the show. Perhaps have a place people can go to take their shots which is out of the way of the main audience. Perhaps have a song break where you tell everyone in the audience to get their photos out of the way now, and then ask them to put the cameras away and enjoy the concert. Coming from the artist themselves, that seems more likely to be heeded anyway. Perhaps tell the audience they can take photos but NOT to use a flash. Some of them still will, but if you couple it with a reasonable explanation of why you shouldn’t use a flash, I think many people will respect it. Also, have a big sign posted indicating that shooting photos is acceptable for personal use. That still lets you crack down on those trying to make a buck off their concert shots, while allowing the people you really can’t stop anyway to shoot a photo to post to Facebook.

After all, those Facebook photos are probably doing more to promote and benefit the artist than they are to harm.

[Review: it was a good show, and the third-row center seats were excellent.]

A plea to TV programmers

My wife and I have recently been discussing the idea of canceling our cable TV. There are a variety of reasons for doing so, which include these circumstances:

  • Cable TV is trying to push digital cable by removing channels from analog cable.
  • We see no present value in the additional costs of digital cable. In fact, we see no value in getting bigger, sharper, TVs, as we feel the picture is just fine, and sufficiently large to see from our sofas a mere 8 feet away. So it’s not about quality. And it’s not about quantity either — the additional cable channels using a digital box are largely in three categories: replicas of channels available in analog cable, additional-fee channels, and music-channels. Of these three, the only ones we ever use are the music channels.
  • We feel that a large portion of content created today is not worth watching. Our viewing preferences have actually narrowed somewhat — there are only two channels we watch with any regularity outside the broadcast channels. Yet, our flat fee paid to cable companies does not adequately reward content providers for making the content that we do like.
  • More content is available online, or through direct-to-mailbox DVDs from Netflix or Blockbuster. Thus, if we don’t mind waiting a bit for content to become available in either online or DVD format, there’s no need for live broadcast anyway. Even better — when paid for by users, this content is generally commercial-interruption free and better quality than we get through the cable company anyway. I distinguish between commercial-interruption free and commercial free because as we know, the new wave is in product placement on shows. But at least it doesn’t contain those hideously large and non-silent network overlays from channels.

There are others talking this way as well. See this post over at Freedom-to-Tinker for a good read. And today, I read that cable companies want to offer exclusive channel content online to subscribers [story]. So this is my plea to programmers. Forget TV stations and network affiliations. Instead, sell your shows direct to viewers. Do it without ads (though I imagine you’ll still have product placement/endorsements), or at least have a two-tiered system where users can pay more for an ad-free program. Then, you will get a better picture of your viewers, and can probably do a better job of marketing to them. Online word-of-mouth can help your show catch on and grab followers. If you are worried about steady income, offer us high-priced single-show samples, and more reasonable season buy-ins. I would much prefer this, so I can get just the 10-ish shows I actually watch rather than the vast array of TV programming I don’t care about.

Happy New Year! — Leap Second Update

So I realized yesterday that all the news stories suggesting the New Year’s countdown should be 3-2-1-1 were just dramatizations of reality. I guess I should have figured that all along, but got caught up in it myself. You see, the leap second was added right after 23:59:59 UTC (coordinated universal time), and therefore, in my time zone, had been added to the clock 5 hours before midnight. There was nothing unusual about our countdown (at least, for the last 10 seconds).
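
For anyone who wants to check the time zone arithmetic, here is a small Python sketch. It assumes a fixed offset of five hours behind UTC (the `LOCAL` name is mine), and it takes the year 2008 from the date in the CNN link. Python’s `datetime` has no way to represent the leap second itself (23:59:60), so the sketch converts the last ordinary second before it.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed offset of UTC-5, matching the "5 hours" mentioned above.
LOCAL = timezone(timedelta(hours=-5), "UTC-5")

# Last ordinary second before the leap second. Python's datetime cannot
# represent the leap second itself (23:59:60), so we convert this instant.
last_second_utc = datetime(2008, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
last_second_local = last_second_utc.astimezone(LOCAL)

# Local midnight on New Year's Day, for comparison.
local_midnight = datetime(2009, 1, 1, 0, 0, 0, tzinfo=LOCAL)

print(last_second_local.strftime("%Y-%m-%d %H:%M:%S"))  # 2008-12-31 18:59:59
print(local_midnight - last_second_utc)                 # 5:00:01
```

Note that `datetime` arithmetic ignores leap seconds entirely, so the printed gap does not count the extra second. Either way, the leap second landed around 7 PM local time, hours before our countdown.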

But I have yet to find a reporter who actually reported what happened in Trafalgar Square (or other places celebrating New Year’s with the UTC leap second in play). Did they start their countdown a second later? Or did they count some number twice? Or were they early? The public wants to know (or at least I do)!

Leap Seconds

So this year there will be a leap second added just before midnight, the first since 2005. As we all go to celebrate New Year’s, I wonder how this will be treated by the TV networks (I don’t recall how it was handled in 2005). Will we start our countdown one second later than usual, so that the 10 second countdown starts at 11:59:51? Or will we all actually celebrate the New Year one second too early? Perhaps the network will add their leap second early to avoid confusion. Will municipalities running fireworks shows start them on time? Do they launch fireworks at that second, or do they try to time it to explode at that second? Will the general public have any clue that a leap second occurred?

http://edition.cnn.com/2008/TECH/12/31/leap.second.new.year/?iref=mpstoryview

Puzzle Oops!

So we bought a 1000 piece puzzle from WalMart before Christmas for the family to put together, and last night we did so. However, there were a few problems with the puzzle. While putting it together, we became convinced that a few pieces couldn’t fit anywhere in the puzzle. Most of the time, though, we thought we were just kidding, and that it would eventually become clear where they fit.

When we had put in “all” the pieces, there were three extras. There really were pieces which didn’t go in the puzzle. On further inspection, these pieces turned out to be duplicates of 3 other pieces in the puzzle. They really didn’t fit! And moreover, we had paired one of them with another and therefore couldn’t figure out where in the puzzle the two pieces went.

While it’s annoying to have been frustrated about these extra pieces, the more frustrating thing is that we were one piece short. While I cannot be 100% certain that we didn’t just lose the piece, I have to believe that since we have extra pieces, those are missing from someone else’s copy of the puzzle.

So how does this happen? Perhaps they lay them flat, and cut many together, then whisk them off. Then in this case, two pieces may be stuck together, and the whisking process just screwed up?

So if you bought an I Spy puzzle from WalMart, check and make sure it has all the right pieces. And if you have extra ones, let me know. Maybe we’ll start a puzzle-piece exchange to fix it up.

On the other end — the question becomes, how can you do quality assurance on jigsaw puzzles? How do you avoid this kind of production problem? Maybe I’ll do some research into the mass production of jigsaw puzzles to figure out how this happens and how to prevent it.

Validating email addresses

As an early user of Gmail, I was able to select precisely the username I wanted, ckillian, which is a very common username for people whose first name starts with ‘C’ and whose last name is Killian. Unfortunately, as often goes for popular shared services, there are many Gmail users who fit those parameters. By itself, that wouldn’t be a problem, except that on occasion, these other Gmail users seem to forget that ckillian is not their email address. They will use it to buy tickets on Ticketmaster, place beach house reservations, set up iPod accounts, request proprietary recipes from companies, purchase items from websites, and, most recently, even purchase online postage from the USPS.

The “best” part is when users are so convinced it is their address that they go through Google’s password recovery system to try to get my password. This has happened 3 times so far. (I know, because Google sends a link to my email address for me to follow if I want to proceed in resetting my password.) One truly intelligent user, after going through the password change system and failing, actually sent me an email asking if I would forward the information to her, which I was happy to do (on a temporary basis).

I recognize that for the users, this is generally an honest mistake (I get these receipts for a CXXXX Killian, and what’s obvious to me is that ckillian is their username for some other things, and they just got mixed up while entering the email address). When I get such receipts, responses, etc., if there is a phone number listed for the user, I often make the attempt to phone them to let them know of their mistake. But more often than not, there is no way shown to contact them. In these cases, I have two options: (1) ignore it, and hope I don’t get stuck on some mailing list, or (2) contact the seller/sender and let them know they are sending information to the wrong party. Some vendors handle this well: the Apple store took care of it without hassle. Others, like the USPS, take some convincing (at first they thought I was trying to commit fraud). And then there are those like Ticketmaster, who I have simply given up on, because I can’t seem to get them to stop sending me junk even when I did set up the account (and diligently unchecked the boxes so I would not receive the junk).

This represents a fairly significant issue though, because for many of these services and sites, by going through a forgotten password dialog, I could have the password reset and emailed to my account, giving me access to their information and account, and possibly other information such as credit cards, or perhaps just the ability to purchase things using their credit cards.

And what frustrates me the most is that most of these sites have set up a kind of account based on this email address, without validating that the new user actually has access to the email address. It’s one thing if you simply mistype an email and as a result a single receipt goes to the wrong email address. It’s quite another if you are saving state for the user under this email address without validating it. Most websites’ form of validation is just to have the user type it twice. But we should know by now that user data cannot be trusted, and if we are going to store that kind of information, we really should validate the email address.

And it’s not even that hard to do so: mailing lists do this all the time. When you subscribe, they send you an email that you must respond to, proving you received it, before allowing your subscription to proceed. All sites creating accounts should do the same. I would much rather have gotten an email from the USPS telling me to validate someone else’s account (which I would not have) than gotten a receipt for delivery confirmation postage for a particular person.
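
The mailing-list approach can be sketched in a few lines of Python. This is a toy, in-memory illustration under my own assumptions, not any particular site’s implementation: `AccountRegistry`, `register`, and `confirm` are made-up names, and `send_email` stands in for whatever actually delivers mail.

```python
import secrets


class AccountRegistry:
    """Toy in-memory account store that holds accounts until the email is confirmed."""

    def __init__(self, send_email):
        self._send_email = send_email  # callable(address, message); hypothetical mailer
        self._pending = {}             # token -> email address awaiting confirmation
        self.confirmed = set()         # addresses proven to be reachable

    def register(self, address):
        """Start registration: mail a one-time token, store nothing permanent yet."""
        token = secrets.token_urlsafe(32)  # unguessable random token
        self._pending[token] = address
        self._send_email(address, f"Confirm your account with this code: {token}")
        # Deliberately return nothing: the token travels only by email.

    def confirm(self, token):
        """Finish registration only if the token matches one we mailed out."""
        address = self._pending.pop(token, None)
        if address is None:
            return False
        self.confirmed.add(address)
        return True
```

The point of the design is that the token is only ever sent to the mailbox, so only someone who can read that mailbox can finish creating the account. A real site would also want the pending entries to expire after a while.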

So to those of you developing websites which create accounts for email addresses — please, please, please validate these email addresses before storing them!

Server Cookies, and I don't think they quite understand advertising…

I should start by explaining that I regularly run my web browser with cookies disabled. The reason is that I decided websites are tracking you too closely, especially websites which you didn’t even know you were visiting. For example, open up your cookie list. (In Firefox, this is Tools->Options under Windows, or Edit->Preferences under Linux, then Privacy->“Show Cookies”.) The questions to ask yourself are:

  1. How many of the sites listed do I even recognize?
  2. Of the sites I do recognize, what do I want that site to remember about me the next time I visit?

Cookies, you see, are small pieces of data that a server gives to a web browser, asking the browser to present them whenever it visits a set of pages on a set of sites. Cookies have a number of legitimate uses, most notably to give the browser a “session” id. The “session” id is used so the browser user can, e.g., log in, and have the server keep track of information related to the login. (The other option, not using cookies, is to make the session id part of the URLs, which is both ugly and more likely to be logged by third parties such as proxies and caches run by ISPs.)
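
As a rough illustration of the session id idea, here is a sketch using Python’s standard `http.cookies` module. The `sessions` table, the cookie name `sessionid`, and the two helper functions are my own inventions for the example; real web frameworks handle this for you, and differently.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical server-side session table: session id -> data about the login.
sessions = {}


def login(username):
    """Server side: create a session and return the Set-Cookie header line."""
    session_id = secrets.token_hex(16)
    sessions[session_id] = {"user": username}
    cookie = SimpleCookie()
    cookie["sessionid"] = session_id
    cookie["sessionid"]["path"] = "/"
    cookie["sessionid"]["httponly"] = True  # hide the cookie from page scripts
    return cookie.output(header="Set-Cookie:")


def current_user(cookie_header):
    """Server side: look up the session from the Cookie header the browser sent."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("sessionid")
    if morsel is None:
        return None
    session = sessions.get(morsel.value)
    return session["user"] if session else None
```

The server hands out the id once in a `Set-Cookie` header. On every later request the browser sends back `sessionid=...` in its `Cookie` header, and the server uses it to find the login. The cookie carries only the id; everything else stays on the server.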

Then there are some arguably useful features of cookies. For example, many online retailers will set a cookie identifying you at your browser, and recognize you immediately when you visit again (not for purchasing, but for welcoming you and tracking the products you look at, so as to remind you of products you’ve viewed and to suggest new products based on your viewing history). I personally find that a little creepy, though I admit in some cases it can be valuable. A few years ago, there were even reports of sites using cookies to do Dynamic Pricing (story by CNN), a practice where sites change the prices based on information they keep about the customer. There were reports of users visiting Amazon from a new computer, finding an item they like, then logging in, and seeing it for a new price. In my opinion, these types of things outweigh the possible benefits of having a site remember me just because it can.

Next there are, in my book, some outright despicable practices. Advertisements placed on sites will add cookies which get reported back to these tracking sites anytime you visit any site with an advertisement from the same company. As a result, there are sites which simply compile vast amounts of information about where you go and what you do online, to use in any way they see fit. These are commonly called “Tracking Cookies” by products such as Ad-aware and Spybot, which will remove the ones they recognize for you.
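
To see why one ad company’s cookie is enough to follow you around, here is a toy simulation in Python. It is a simplified model under my own assumptions, not how any real ad network is built: `AdNetwork` and `serve_ad` are made-up names, and the `embedding_site` argument stands in for the Referer header that tells the ad server which page held the ad.

```python
import itertools
from collections import defaultdict


class AdNetwork:
    """Toy model of a third-party ad server that tracks browsers via a cookie."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.profiles = defaultdict(list)  # cookie id -> sites where ads were shown

    def serve_ad(self, cookie_id, embedding_site):
        """Handle one ad request; returns the cookie id the browser should keep.

        cookie_id is what the browser sent back (None on the first visit, or
        always None if the browser rejects cookies). embedding_site stands in
        for the Referer header, which tells the ad server which page held the ad.
        """
        if cookie_id is None:
            cookie_id = next(self._ids)  # "set a cookie": assign a fresh id
        self.profiles[cookie_id].append(embedding_site)
        return cookie_id


network = AdNetwork()

# A browser that accepts cookies presents the same id on every site.
cookie = None
for site in ["news.example", "shop.example", "health.example"]:
    cookie = network.serve_ad(cookie, site)

# A browser that rejects the cookie looks like a new visitor each time.
for site in ["news.example", "shop.example"]:
    network.serve_ad(None, site)

for cookie_id, sites in network.profiles.items():
    print(cookie_id, sites)
```

The first browser ends up with one profile listing all three sites, while the cookie-rejecting browser shows up as unrelated one-visit strangers. This only models cookies; it says nothing about other ways a server might recognize you, such as your IP address.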

I have simply taken the approach (mostly as an experiment) that sites shall not store cookies without my express consent. To that end, I have installed CookieSafe, which makes it easier to manage cookie settings. I either give or reject cookies from specific sites. This occurs as a site preference, meaning if a site uses both kinds of cookies, and I want to use the site, I accept them both. Importantly, the third-party cookies are still rejected — I have to authorize them separately.

So my browsing works like this: I browse normally, then if a site isn’t working (and particularly if submitting a login doesn’t work), I realize it needed cookies to work. I then decide if I really want to use the site, and if I do, I enable cookies for that site only.

Now, when I view my list of cookies, I can identify most of the sites. (Some I must have authorized, but don’t quite recognize by site name, like the third party my bank uses to process online billpay.) I find this to be much more acceptable, and my browsing hasn’t been worse for the wear.

A few days ago, however, I saw something that really brought a smile to my face. On a site I visited while trying to figure out what it meant to buy fertile eggs, I saw this image, where an ad belongs:
“No Cookie” Advertisement

I just had to laugh. If a site wants to not send me ads because I reject cookies — then great! I didn’t want them anyway. But somehow I think they’ve missed the point of advertising. If I were they, I would send SOMETHING back. But all the same – I hope other sites take this approach. It could be the end to all the annoying flash ads I get, if instead I got these images everywhere!