
May 2007 Archives

May 16, 2007

What the hell have I been doing? A story in multiple vignettes.

I have been incredibly swamped with building several tangible and intangible products and skills. In plain English, I have been busting my ass at work, reading a bunch of books, going to a gym again, hacking some electronics, and rearranging the furniture in our apartment. None of this is super exciting to anyone but me, but it is work that needs to be done, and it has kept me from, well, hacking English rather than code.

Until I have some more "free time", I am going to do some short posts on what I have been up to. Tonight, however, I will be heeding Nate Lawson's call for a BaySec meetup.

May 17, 2007

What the hell have I been doing? Part 1: Poor Man's MBA

Jose Nazario once mentioned to me the concept of a "poor man's MBA", or what a technologist should read in order to understand the business world. I have little motivation, time, or energy to go for a real MBA (one terminal degree is enough), but I do need to absorb as many of the soft skills as possible if I want to, well, not suck as an employee.

With that goal in mind, I read The Five Dysfunctions of a Team: A Leadership Fable to get an idea of how executive teams interact and how decisions (and indecision) come about, Crossing the Chasm to understand the mentality of marketing folks inside tech companies, and The Innovator's Dilemma to get some pointers on how to push new technologies or innovations through an organization.

For all three books, the language and style used is as important as the concepts discussed. I have found that people tend to assign you to different schools of thought based solely upon how you frame a discussion, and will oftentimes not accept a line of argument if you don't frame it in their native belief system. By recasting a discussion in terms they understand, your argument may be more easily accepted.

A trivial example of this is the use of the term "non-optimal". I tend to use it frequently as a result of my formal training in algorithm optimization. I am very careful to say "unprofitable" or "a poor investment of resources" rather than "non-optimal" when talking to business stakeholders about engineering decisions. The latter communicates that you are thinking with a technician's brain, which traditionally doesn't appreciate customer motivators, rather than that of a business person, who may already believe that engineers are short-sighted and don't see "the big picture".

Anyone who has straddled both the technician's and the non-technician's worlds can attest to these issues, so there is no point in going much further on the topic. I should have some other super-awesome-cool content later this week, though.

May 18, 2007

What the hell have I been doing? Part 2: Data Representation

Like it or not, any analysis work that you do is pretty much worthless unless you are able to present the data effectively. Effective data presentation becomes more difficult when new data has to be consumed on a regular basis. Hand-massaging the information has to take a back seat to automation, otherwise you (the analyst) will spend your entire life recreating the same report. The data also has to be extremely accessible, otherwise your customers won't even bother looking at it.

For example, let's consider the story of some data analyst named... Rudiger. Rudiger has a large volume of numbers about... virus outbreaks locked up in SQL somewhere. Using the tried and true methods acquired as a grad student, Rudiger glues some Perl scripts together, followed by smoothing and other cherry-picking using Matlab or, god forbid, Excel. As people ask for the data on a more frequent basis, our intrepid hero adds more automation to ease his report generation, with graphs e-mailed to him and other concerned parties on a regular basis. He quickly discovers that no one is reading his data-laden e-mails anymore, leaving poor Rudiger to announce conclusions that others could draw themselves simply by looking at a graph provided for them.

What Rudiger doesn't quite realize is that people need to feel they can own the data and manipulate it so that it tells them a story, and not just the story that Rudiger's graph wants them to see. In much the same way that many "technical" (absurdity!) stock analysts will generate multiple forms of charts rather than looking at the standard data provided by financial news sites, data consumers want to feel they can draw their own conclusions and interact in the process rather than be shown some static information. There are several interweb startups based upon this very concept.

For those of you who haven't figured this out by now, I'm Rudiger. Rather than send out static graph after static graph that no one looks at, I learned a web language and threw together an internal website that allows people of multiple technical levels to explore information about virus outbreaks. While it is nowhere near as sophisticated as ATLAS, the service tries to emulate Flickr's content and tag navigation structure, where viruses are the content and tags are what we know about the specific threat. The architecture is easy to use and provides a low barrier to entry, as everyone knows how to use a web page. Also, the "friction" associated with the data is low, as anyone who is really interested can subscribe to an RSS feed that leads right to a web page on the virus; two mouse clicks versus pulling data from SQL.

I am generally more accustomed to writing English or algorithms rather than web code. Frankly, I hadn't produced a web app since PHP 3.x was the hotness. After consulting with some of my coworkers and my old friend Jacqui Maher, I decided to throw the site together using Ruby on Rails. With Jacqui on IM and a copy of Ruby on Rails: Up and Running in hand, I went from a cold start to a functioning prototype in about 2 weeks. I was pretty surprised by how far web development has come since 2000, as the ad-hoc methods for presenting data from a table have been replaced with formalized architectures integrated deeply into the popular coding frameworks.
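For the curious, the data model behind this kind of tag navigation is pleasantly small. Below is a minimal Rails-flavored sketch of the Flickr-style structure described above; it is not the actual internal app, and all of the names (Virus, Tag, TagsController) are invented for illustration.

    # Minimal sketch of Flickr-style tag navigation over virus data.
    # Hypothetical names; not the actual internal application.
    class Virus < ActiveRecord::Base
      has_and_belongs_to_many :tags     # each virus carries many descriptive tags
    end

    class Tag < ActiveRecord::Base
      has_and_belongs_to_many :viruses  # each tag groups many viruses
    end

    # Browsing by tag: list every virus that shares a given attribute.
    class TagsController < ApplicationController
      def show
        @tag = Tag.find_by_name(params[:id])
        @viruses = @tag.viruses
      end
    end

The whole trick is the many-to-many join between content and tags; once that is in place, the framework generates the navigation queries for you, which goes a long way toward explaining the two-week prototype.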

Moral(s) of the story: Reduce the cost and barriers to analyzing your own data. Put your data in the hands of the consumer in a digestible, navigable form. Remove yourself from the loop. Don't worry, you will still be valuable even when you aren't the go-to guy for generating graphs, as there is plenty of work to go around right now.

[Sidenote: The sad thing is I learned this lesson about reducing the burden of analyzing regularly generated data once before. The entire motivation behind a project I consulted on many moons ago, namely Sourcefire's RNA Visualization Module, was to provide attack analysts with an easy-to-absorb presentation of underlying complex data.]

May 19, 2007

What the hell have I been doing? Part 3: Living Room, Redux


[Photo: Living Room, Redux; originally uploaded by Adam J. O'Donnell.]
Our living room has been redecorated with furniture acquired from a departing (moving, not dying) friend. This is a big upgrade from the single futon we had, now placed in the office/craft room/second bedroom. I feel like we have an adult's apartment now.

May 20, 2007

I don't know how to blog.

Apparently I pasted this into the MT window twice, once unedited, once with the final edits. I should read drafts before publishing them, I think. Thanks for pointing it out, Aaron.

May 21, 2007

Overtraining and Entrepreneurship

Vipul and I have had a long-running discussion on why certain people have the mentality to build a startup and others do not, and how this seems to be completely uncorrelated with raw intelligence. I argued for a long time that my academic training leads me to abstract new ideas into old frameworks for analysis, and also to discount the impact of certain new ideas. In some ways, it is a decidedly Hegelian view of knowledge, but one that has worked for me for some time. Vipul, on the other hand, has argued to me that all the potential impacts of a technology can never be seen at the onset of the company, and that belief in yourself and faith in your concepts and your team will carry you through. I had a great deal of difficulty seeing his point of view until I read Crossing the Chasm and The Innovator's Dilemma, both of which speak to and provide numerous proof points for Vipul's world view.

An article popped up on this weekend's BoingBoing that addressed the very idea that young (under 30), untrained people make better entrepreneurs than older, highly experienced people. The author makes several excellent points about the almost beautiful arrogance of a young engineer who believes they can change the world with a simple idea.

However, amongst other issues, the author doesn't speak much to the extremely high failure rate of new concepts, or to the expectation (in the probabilistic sense) that a given idea will pay off. ("Expectation of career success" is not a new idea, either; there is a good section in Fooled by Randomness on this topic.) Additionally, several of the technologies listed weren't really viable for making money using the business models successfully applied by their usurpers. Yahoo, for example, relied upon human categorization of content and a relatively unsophisticated search algorithm compared to Google, whose PageRank algorithm made the monetization of search, through AdSense, far easier. It isn't that people who have been around the business longer don't see these new opportunities; far more often it is the organizational structure, rather than pure ignorance of the changing space, that prevents them from implementing some whizbang new technology that would assist their core business and open up new markets.

Anyways, in short, the article is very much worth the read for anyone who has asked themselves why person X started a company while person Y did not.

May 28, 2007

Welcome to Portland!


[Photo: welcome to portland; originally uploaded by Adam J. O'Donnell.]
Murray Kucherawy flew me and another friend to Portland this weekend. It was the first time I had ever visited the city; at times it felt like Philly (industrial decay), and at others like San Francisco-lite (hipsters). I was able to visit my old friends Liz and Matt and see Liz's recent sculpture work, which, I feel, tries to use people's compassion for animals to expose the suffering of human beings in the Portland area. The entire show of OCAC student work was far more impressive than anything I have seen in San Francisco as of late. Most of the art that I see on the west coast seems to be relatively low-brow, lacking a message beyond the artist saying "look at me and what I created".

Some random comments:
  • There is a chain called Javaman Coffee.
  • Powell's Technical Books has a more comprehensive selection than most engineering university bookstores.
  • The term "clear-cutting" had no meaning to me until I saw it from the air.
  • Flying in a Cessna was less terrifying than I initially expected.
The entire photoset can be found here.

May 31, 2007

Botnets and Emissions Trading

Many of the customers I engage with at work have been struggling with how to identify and handle botnet drones. Now, I am going to assume that everyone who either reads or stumbles upon this page has some understanding of botnets and their impact. Over the past several weeks, Estonians have become very familiar with the effects botnet-enabled DDoS attacks can have on everyday life. These networks are also the prime source of spam. There is common agreement that yes, botnets are a problem, and yes, they need to go away. But who should actually bear the burden of de-fanging these networks?

Disarming the actors behind these attacks involves dismantling the botnets themselves, which is itself an increasingly challenging problem. Older-style bots used IRC servers as a central command-and-control mechanism, making them vulnerable to decapitation attacks by security personnel. Newer systems use P2P-style C&C protocols adapted from guerrilla file-sharing systems that are notoriously difficult to control. Other than traffic and content mitigation, which several organizations have proven to be extremely effective, the solution is to take down botnets node by node.

So who should eliminate botnets? End users don't feel responsible or even recognize that there is a problem; all they know is that they are using their computer when someone comes along and tells them they are infected with a virus. Service providers (telephone and cable companies) with infected customers aren't really responsible, and pay the cost through outbound bandwidth charges and outbound MTA capacity, which is a relatively minor charge compared to what the targets of the attacks pay. Operating system vendors aren't responsible, because once they sell the product to the customer, they are no longer liable for if, when, or how the customer becomes compromised. Ultimately, the people who bear the largest cost are the ones who are least capable of remediating the source of the spam, namely the service providers of the attack recipients. These actors have to pay for bandwidth for inbound attacks, storage for spam, and support calls from customers asking why their computer is slow when it is, in reality, a botted system.

In many ways, we have a classic Tragedy of the Commons-type issue. The communal grazing areas, the shared resources that were critical to the working class's ability to make a living, have been replaced by today's fiber lines. Currently the "tragedy" is solved by bandwidth providers through embargoes of one another: if one service provider gets out of line, the others will block all mail originating from the offender. Recently I have been pondering another possible solution, one based upon financial mechanisms.

While it would likely be impossible to implement, a cap-and-trade-style trading system seems extremely appropriate. Similar to carbon trading schemes, a cap-and-trade system for malicious content established between providers would create economic incentives to correctly monitor and reduce the volume of unwanted content that flows between their networks. The system would involve a cap on how much malicious content the parties deem acceptable to send to one another. Providers who are able to better control the amount of malicious traffic, through expenditures on personnel and products, can recoup those costs through the sale of credits representing the difference between their level of outbound malicious content and the agreed-upon cap. Providers who don't police their traffic are forced to buy credits from those who do, which in turn puts a price on their lack of responsibility.
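To make the arithmetic concrete, here is a toy illustration in Ruby. The cap, the traffic volumes, the provider names, and the credit price are all numbers I invented for the example; a real market would have to discover the price on its own.

    # Toy cap-and-trade arithmetic for outbound malicious traffic.
    # All figures below are invented for illustration.
    CAP   = 1_000_000  # permitted units of outbound malicious traffic per provider
    PRICE = 0.01       # hypothetical clearing price per unit, in dollars

    outbound = {
      "CleanNet" => 400_000,    # spends on policing, stays under the cap
      "LazyNet"  => 1_700_000,  # doesn't police its traffic
    }

    outbound.each do |provider, volume|
      credits = CAP - volume    # positive: credits to sell; negative: credits to buy
      verb = credits >= 0 ? "sells" : "buys"
      puts "#{provider} #{verb} #{credits.abs} credits (#{'%+.2f' % (credits * PRICE)} dollars)"
    end

Under these made-up numbers, CleanNet recoups part of its policing expenditures by selling 600,000 credits, while LazyNet pays for its negligence by buying 700,000 of them; irresponsibility finally carries a price tag.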

Eventually, the provider may choose to expose this cost of security to the end user, with rebates or special offers extended to users who keep their systems clean and never cause a problem. The end users in turn are incented to keep their machines clean, the Internet would return to the pre-fall-from-Eden utopia that it once was, and the world would be a happy place once again.*

* Having providers buy into this concept, building a monitoring infrastructure, setting prices, assembling a market, and maintaining a clearinghouse for credit trades would be pretty damned hard. While I don't think this is a practical idea, it does make for a fun thought experiment.

About May 2007

This page contains all entries posted to NP-Incomplete in May 2007. They are listed from oldest to newest.

March 2007 is the previous archive.

June 2007 is the next archive.

Many more can be found on the main index page or by looking through the archives.
