Thursday, December 13, 2012

Rabbit Turds: Why I dislike NoSQL and Big-Data

Don't get me wrong on this one.  I love and use (and even teach others how to use) graph databases like Neo4J and others that label themselves "NoSQL", such as MongoDB.  I also don't have anything against someone who owns an ultra large piece of data.  It is the terms themselves that I object to, and that is the subject of today's rant.

What I don't like is the same kind of nonsensical, illogical naming convention we have seen in the past.  It reminds me of another name that lingers to this day like the stench of a dead woodchuck (Web 2.0).  I wrote in the book "Web 2.0 Architectures" (co-authored with Dion Hinchcliffe and James Governor) that Web 2.0 is perhaps inappropriately named, as the convention of [name]-[version_Major].[version_minor] implies an observable state of a stateful object, more often than not a piece of software.  Since the web is a dynamic beast that is constantly in a state of flux with multiple technologies, this type of versioning cannot be applied.  I have written in the past that "there is no Web 2.0 architecture" and have even heard Tim O'Reilly himself get a bit (pardon the pun) riled up about those running around talking about Web 3.0, Web 4.0 etc.  Ooh, I cringe just thinking of it.  It is like people building penthouses on a pile of rubbish.  Try a Google search for the term "Web 5.0" and just see the lunacy that exists.

When I first heard the term NoSQL used, it seemed to imply that there was not going to be any "SQL" in this movement, but it turns out that the acronym is usually expanded as "Not Only SQL".  Hmm.  Let's think this one through a bit.  Not only SQL means that you are not excluding SQL.  It also means you are not explicitly or even implicitly excluding anything else.  So here is the question.  What is defined by a term that is not exclusive of anything?  It is simply a mathematical set that includes "all".  Can a rabbit turd be part of NoSQL?  Sure.  It certainly doesn't stand for "Not Only SQL But Definitely Not Rabbit Turds" or NOSQLBDNRT for short.  The most ironic thing about the term is that while many think it means "no relational" models or tables, Carlo Strozzi used the term NoSQL in 1998 to name his lightweight, open-source relational database that did not expose the standard SQL interface.  Maybe NoSQL should mean "I was too busy to add an SQL interface and used a different way to access database operations".  It certainly seems more fitting given the cool technological advances.  I was so inspired by the idea of NoSQL I decided to make the movement its own graphic!  May I present to you...   NOSQL!





Ok, so for the record, I love the idea of building on Graph Databases.  My company is building a very innovative platform called Formstr built on top of Neo4J.  Neo4J is one of the best technologies I have ever come across.  It is time to maybe shift our thinking a bit and start classifying the technologies accordingly.  Graph Databases use Nodes and Relationships to store data.  They use ASCII art languages like Cypher to query them, so there is no real SQL involved.  Other databases use native storage formats like JSON that translate into fast returns.  And hello Spring Data!  Why would you use SQL in Java when you can use @annotations?  There is a good article written by Dan Sullivan here summing up the differences.  This brings me to the second part of my rant.
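To make "Nodes and Relationships" a bit more concrete, here is a minimal sketch of the idea in plain Python.  It is not how Neo4J stores anything internally and it uses no Neo4J API; the Graph class, the node ids and the relationship names (USES, QUERIED_WITH) are all made up for illustration.  The two-hop lookup at the end is roughly the shape a Cypher pattern such as (person)-[:USES]->(db) would describe.

```python
from collections import defaultdict


class Graph:
    """A toy in-memory property graph (hypothetical, for illustration only)."""

    def __init__(self):
        self.nodes = {}                    # node id -> dict of properties
        self.outgoing = defaultdict(list)  # node id -> [(rel_type, target id)]

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def relate(self, source, rel_type, target):
        # The relationship is stored with the source node, so following it
        # is a direct lookup rather than a join across tables.
        self.outgoing[source].append((rel_type, target))

    def neighbours(self, source, rel_type):
        # All nodes reachable from source over one relationship of rel_type.
        return [t for (r, t) in self.outgoing[source] if r == rel_type]


g = Graph()
g.add_node("alice", kind="Person")
g.add_node("neo4j", kind="Database")
g.add_node("cypher", kind="QueryLanguage")
g.relate("alice", "USES", "neo4j")
g.relate("neo4j", "QUERIED_WITH", "cypher")

# Two hops: which query languages go with the databases this person uses?
languages = [lang
             for db in g.neighbours("alice", "USES")
             for lang in g.neighbours(db, "QUERIED_WITH")]
print(languages)
```

The point of the sketch is only that the relationship is a first-class thing you walk along, which is why SQL never enters the picture.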

Big Data.

What is Big Data?  Last time I looked, data is stored electronically as streams of ones and zeros.  Since these are binary concepts, size is really not a consideration.  One person's ones and zeros are not bigger or smaller than anyone else's.  Big Data is another term that is vague, ill-conceived and has no real quantification or definition.  Loosely interpreted, I believe most people really mean to say "A Very Large Amount of Data" or perhaps, if it is bigger, we could make the acronym and say we have a VFLAD.  I'll let you guess what the "F" stands for.

Does Big Data mean a single piece of data or a lot of small pieces of data?  I find that due to the interconnected nature of the internet and the accessibility of open data (defined as data that is accessible by anyone at any time without any cost or significant barriers), we are really living in a world of "Abundant Data".  Some say "Interconnected Data", which is also sort of misleading.  Like the lesson learned from the old saying "You can't get there from here", I am tempted to say that all data I can connect to is potentially "connectable" data since I can make the connection.  And yes, that is a VFLAD!

So ends my rant, as mild as they come.  This blogger has better things to put energy into than trying to change people's behavior into using meaningful and well thought out terms.

Wednesday, December 12, 2012

Android Winning the Mobile War?

According to Google Executive Chairman Eric Schmidt, Google's Android mobile operating system is winning the handset war hands down.  Independent analyst firm IDC confirms this with the numbers coming in for Q3 2012: Android enjoyed a 75% market share, with iOS in the runner-up spot at 14.9%.  Microsoft's mobile efforts continue to grow well with a 140% increase in year over year smartphone shipments.


One lesson learned here is to pay attention to developers.  Developers like easy.  We like open.  Making a developer jump through too many hoops to publish a single application is not making friends.  If you examine the amount of energy required to be able to publish your first application, RIM is by far the most tedious.  Apple is also fairly cumbersome to set up and Android has a very low barrier to entry.  Note that once you have made one application, the barrier to publishing subsequent applications is still lower with Android.  Developers don't make all the decisions, but they do make recommendations.  If you have ever asked a family member who is a mobile developer "what phone should I buy?", chances are they are telling you to buy an Android phone nowadays.  The silent yet important grassroots groundswell cannot be ignored.

Here are some additional factors that may possibly affect the market share:

1. Quality

Apple makes good products and we're sure their quality is not diminishing.  Because Apple controls both the hardware and software (iOS), it has a distinct advantage over Android, which is used on multiple types of hardware platforms.  Unlike the personal computing market, many users never re-install their phone's operating system, so it is up to each handset manufacturer to ensure that Android runs well on their hardware.  On a personal note, the Galaxy S III is one lean, mean beast of a phone and is my favorite.

2. Ease of Programming

One aspect of this battle that is not looked at very often is the popularity of Java amongst developers vs. Objective-C.  Java is (for us anyways) much easier to learn than Objective-C.  The syntax used by Java is also more familiar to many JavaScript, C# and ActionScript developers.  Objective-C does give you far more control, but also forces more lines of code to accomplish the same thing.

3. Developer Tooling

Xcode is used by iOS developers while Android developers are free to use whatever they want.  Since Xcode's inception, it has steadily improved.  Eclipse is commonly used by Android developers and has some challenges (R.java anyone???).  IntelliJ's new IDEA release is very promising for Android development.  I currently develop on IntelliJ 11 and may try out 12 with some new Android features and do a series of tutorials on it.   There are

4. Cost

Android phones should logically be lower in cost as there is more competition.  Having said that, the observation is that price is not as much of a factor, given carriers often subsidize handsets with plans.

5. Variety

Android wins here again hands down.  There are simply many different devices to choose from rather than sticking to the iPhone.

There are more factors beyond these; however, the summary is clear.  Android is here to stay and will remain a dominant force in the marketplace.

Tuesday, December 11, 2012

The Fragile Rotting Web

Today Michael Arrington said what the rest of us were thinking. The title of Michael's article is "They Screwed Us. Right Before They Screwed Us Again. #poohead". The TechCrunch story just scratches the surface for me.  Dude - where is my internet?  The answer is that it is rotting from the core.

The rotting internet goes beyond just the mechanics.  We sign up for things expecting we deserve to use a service for free on an indefinite basis.  Think of MySpace.com.  At one time the most popular internet destination, now a ghost town.  Never before has history shown us the focus of a planet shifting at such an incredible pace.  Companies that back ventures like MySpace want to eventually earn revenue for their investment.  Sometimes this means redefining licenses and terms of use or imposing certain limits on the free versions of their offerings.  Michael's rant was legitimate IMO but there is a second side to it.  We're getting what we pay for.  Consider this.  Have you ever paid money to Twitter, Facebook, Craigslist, Wikipedia, Skype, MySpace or Instagram?  No?  As a society we expect these services for free.  The fact is we might all need to ask ourselves why Google is giving us free email services, free blogs (this one included), an online alternative to Microsoft Office and more.  Google is driving advertising revenue via these items of course.  Facebook, Twitter and others will need to follow eventually.  The various legal systems we have in place to protect our real world rights are sadly lagging far behind the capabilities of the internet and the evolution of social media.  The overwhelming resistance to control via acts like SOPA and PIPA has shown that while we do complain a lot, we also like our wild west frontier justice.

The internet is also rotting from its technical fragility.  Link rot is the most prevalent symptom, caused by the very basic architectural patterns of the web.  Anyone can link to anything else in a unilateral manner.  I can simply type in <a href="http://somelink.com/resource.html">anchor</a>  and I have made a link.  The resource I have linked to has no responsibility to me to ensure that the link target does not change.  When I make the link, I assume all the risks.  The problem is that the protocols and standards used to build the web allow this sort of linking with no heartbeat mechanism.  If the resource is taken down or the site is redesigned and refactored, my link now leads to a different resource or is a null pointer.  I highly suspect that my blog contains hundreds of broken links that result in 404s.  The basic architecture is flawed by design, yet it was the best choice.  The alternative, tightly coupling resources, would also be problematic.  Nevertheless, it might be a good time to re-examine the basic architectural principles of the world wide web.  While Tim Berners-Lee and others run off in pursuit of their semantic web ideals, the core upon which they build continues to become more and more fragile.
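Since the web itself offers no heartbeat, the only way to find out that a link has rotted is to go and poll it yourself.  Here is a rough sketch of such an audit in Python.  Everything named here (LinkCollector, classify, audit, the get_status callback and the fake_web table of canned responses) is my own invention for illustration, and the status codes are simulated so nothing touches the network.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def classify(status):
    """Map an HTTP status code (or None for no response) to a rough verdict."""
    if status is None:
        return "unreachable"
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "moved"
    if status in (404, 410):
        return "rotted"
    return "error"


def audit(html, get_status):
    """Return {url: verdict} for every link found in html.

    get_status is a caller-supplied (hypothetical) function that returns the
    HTTP status for a URL, or None if the host could not be reached.
    """
    collector = LinkCollector()
    collector.feed(html)
    return {url: classify(get_status(url)) for url in collector.links}


# Simulated responses standing in for real HTTP requests.
fake_web = {
    "http://somelink.com/resource.html": 404,
    "http://example.com/": 200,
    "http://example.com/old": 301,
}
page = ('<a href="http://somelink.com/resource.html">anchor</a> '
        '<a href="http://example.com/">home</a> '
        '<a href="http://example.com/old">old</a> '
        '<a href="http://gone.example/">gone</a>')
report = audit(page, fake_web.get)
print(report)
```

Note who is doing all the work: the page that made the link.  The target never has to tell anyone anything, which is exactly the unilateral arrangement described above.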

Redirects should save us though, right?  No!  Redirects cause delays and are like temporary bandaids.  A redirect is similar to forwarding your mail when you move.  I just ran the W3C link validator and checker on my former employer's site and found many redirects.
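To illustrate the mail-forwarding problem, here is a small sketch that follows a chain of redirects and counts the hops.  The redirects table is a plain dict I made up to stand in for the Location headers a real server would send, and resolve and max_hops are hypothetical names, not part of any library.

```python
def resolve(url, redirects, max_hops=5):
    """Follow redirects starting at url and return (final_url, hops).

    redirects maps an old URL to its new location (simulated data).
    Raises RuntimeError on a loop or when the chain exceeds max_hops.
    """
    seen = {url}
    hops = 0
    while url in redirects:
        if hops == max_hops:
            raise RuntimeError("too many redirects")
        url = redirects[url]
        if url in seen:
            raise RuntimeError("redirect loop")
        seen.add(url)
        hops += 1
    return url, hops


# A post that has moved twice: every visitor pays for both hops.
table = {
    "http://old.example/post": "http://old.example/blog/post",
    "http://old.example/blog/post": "http://new.example/post",
}
final_url, hops = resolve("http://old.example/post", table)
print(final_url, hops)
```

Each hop is an extra round trip for the reader, and the chain only works as long as every forwarding address along the way is still being maintained.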

The conclusion is inescapable.  As the web grows, domain ownership changes and resources become more intertwined with each other, we will experience more fragility.  The social media giants of today will strive to redefine how far they can push users to extract the maximum revenue from their investment.  Those who push on users too hard and reconfigure features and functionality will fail like MySpace.  Those who offer simple, one dimensional services and are out to capture market share will proliferate overnight like join.me, Formstr and others.