Blogarithms

Doug Kaye’s Weblog

12/4/2008

Website Editors Wanted

3:57 pm

Here at The Conversations Network we used to recruit website editors and audio engineers on an ongoing basis, but we’ve had such a terrific team for so long, that I can’t even recall the last time I had to put out the word for new volunteers. Now, with a bit of attrition in the ranks of both our website editors and series producers and a new channel on the way, it’s time to add to the team once again.

If you’d like to help us write descriptions for our programs, track down and crop photos and sync the occasional slideshow, here’s your chance. Ours is an all-volunteer organization although we do pay wine money (a little more than beer money) thanks to donations from our paid members. It’s not enough for you to quit your day job, but you’ll get a whopping US$15 for each description you write - US$25 if you also sync a presentation’s slides to the audio.

You’ll find details on TeamITC (as we call it for historical reasons) and our Apprenticeship Program on The Conversations Network web site.

12/1/2008

Bad RSS

8:01 pm

From the TMI category for most of you, but of interest to at least two readers…

We’re getting some bogus RSS feeds, some from otherwise respectable media sources. One class of problems has to do with GUIDs (Globally Unique IDs). In particular, we’re seeing a single GUID being used for different programs, which violates the whole idea of a GUID. We thought we could depend on GUIDs as the sole mechanism of identifying a program, but when a site re-uses its GUIDs, the effect is that the programs appear to change more than once every time the feed is scanned, which drives our updating logic crazy. Here’s what I think we’re going to do:

  • If any <guid> appears more than once in the current <item>s of a feed, we’ll never depend on GUIDs for that feed again.
  • If we’ve never seen such a duplicate GUID, we’ll use each <item>’s GUID as it’s supposed to be used: to uniquely identify the program.
  • If we’ve ever found a duplicate GUID for a feed, we’ll look at the <title> elements and the <enclosure url= attributes.
  • If either the title OR the url for an item matches one that’s in the database for this feed, we’ll assume the scanned item is just a modified version of the program previously found. The reason is that we occasionally see a site change the title or the media filename of a program, but rarely both at once. (If they do change both at once AND they’ve ever used dupe GUIDs, there’s not really much we can do. We have to assume it is indeed a new program.)
  • IOW, if the GUIDs have been bogus and neither an item’s title NOR media URL can be found in the database, we’ll assume it’s a new program.
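
The rules above can be sketched in a few lines of Python. This is only an illustration of the plan, not our actual code: the `Item` class, the function names and the in-memory `stored` list are all hypothetical stand-ins for whatever the scanner and database really use.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical record for one <item>: its <guid>, <title> and <enclosure url=...>.
    guid: str
    title: str
    enclosure_url: str


def has_duplicate_guids(items):
    """True if any GUID appears more than once among the scanned items."""
    guids = [item.guid for item in items]
    return len(guids) != len(set(guids))


def find_existing(item, stored, guids_trusted):
    """Return the stored program that `item` updates, or None if it looks new."""
    if guids_trusted:
        # No duplicate GUID has ever been seen for this feed: use GUIDs as intended.
        return next((s for s in stored if s.guid == item.guid), None)
    # GUIDs have been bogus: match on title OR enclosure URL instead.
    return next(
        (s for s in stored
         if s.title == item.title or s.enclosure_url == item.enclosure_url),
        None,
    )


def scan_feed(items, stored, guids_trusted):
    """Classify scanned items. Returns (guids_trusted, updates, new_items)."""
    # Once a duplicate shows up, this feed's GUIDs are never trusted again.
    guids_trusted = guids_trusted and not has_duplicate_guids(items)
    updates, new_items = [], []
    for item in items:
        match = find_existing(item, stored, guids_trusted)
        if match is None:
            new_items.append(item)
        else:
            updates.append((match, item))
    return guids_trusted, updates, new_items
```

The caller would need to save the returned `guids_trusted` flag with the feed, since the first rule says a single duplicate disqualifies that feed’s GUIDs for good.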

The Unkindest Click of All

1:37 pm

Boys and girls, don’t try this at home. I was doing a bit of late-night file reorganization and I wanted to delete a directory of photos from my 3TB Drobo. I opened Drobo in the OS X finder, right-clicked on the directory and clicked on Move to Trash (OS X). Then I noticed that I’d accidentally moved the mouse and clicked on Backups.backupdb, the folder that contains everything in Time Machine. Oops! Too late. That’s right, 130+GB of files now in the Trash.

Okay, I figured, I should be able to recover from that. So I tried to move the directory from the Trash folder back to Drobo. About 30 hours later that failed, so I gave up on saving my Time Machine database. Next step was to run Time Machine to create a new backup, and that took the better part of an entire day. I’m now in what I hope is the final stage: emptying the Trash and getting rid of those 7+ million files that I can’t use with Time Machine. “Preparing” took about 20 hours. It looks like “Emptying” should be done by midnight or so.

Lest you think that Time Machine combined with redundant disk storage makes for an idiot-proof backup scheme, just consider that you may need to redefine “idiot.” Hard to believe I managed to find and execute a one-click vulnerability to the whole system, but that’s what I get for trying to do anything risky late at night. Be careful out there.

10,000th Program

12:59 pm

We just reached 10,000 programs in the SpokenWord.org database. Just so happens that number 10,000 is a video that I submitted: Clay Shirky at Pop!Tech 2008.

11/30/2008

Taxonomic Challenges

7:06 pm

Over on SpokenWord.org we started with a set of “source” categories such as Conference, Interview, Lecture, Sermon and so on. These categories turned out to be rather useless since very few visitors really cared whether a recording was from a conference or a lecture, for example. What they cared about was whether it was about chemistry or China, which this taxonomy didn’t address.

Next we decided to go with a free-form tagging folksonomy as do many other content sites. For better or worse, we have a semi-automated source of tags: the <keyword> elements of the RSS feeds that supply most of our new programs. Tagging has worked quite well as a search mechanism: a way to actively find content. You can now search for chemistry or China and get reasonable results.

But we also want to present content in a more traditional manner. We want to proactively feature programs (particularly on the home page) in ways that will encourage first-time visitors to listen and view. So we’re thinking of re-instituting a taxonomy of categories in addition to our tags. Now comes the challenge of defining the categories. Here’s the taxonomy we have so far. We want to keep the count to no more than fifteen, so we need to combine where possible, but we want to make sure any spoken-word content fits into at least one category appropriately.

  • business and finance
  • science and technology
  • health and medicine
  • education
  • arts, entertainment, media and literature
  • energy/environment
  • food and drink
  • religion
  • government and politics (current affairs?)
  • sports, recreation & hobbies
  • travel/history
  • comedy (humor)

Anything missing? Remember, these are topical categories, not sources, media, etc.

Update: Here’s another option. We could simply adopt the categories used by iTunes for podcasts. It’s not perfect, but it has the advantage that all of our collections and feeds would be guaranteed compatible with iTunes’ taxonomy. Here’s the list from Apple:

  • Arts
    • Design
    • Fashion & Beauty
    • Food
    • Literature
    • Performing Arts
    • Visual Arts
  • Business
    • Business News
    • Careers
    • Investing
    • Management & Marketing
    • Shopping
  • Comedy
  • Education
    • Education Technology
    • Higher Education
    • K-12
    • Language Courses
    • Training
  • Games & Hobbies
    • Automotive
    • Aviation
    • Hobbies
    • Other Games
    • Video Games
  • Government & Organizations
    • Local
    • National
    • Non-Profit
    • Regional
  • Health
    • Alternative Health
    • Fitness & Nutrition
    • Self-Help
    • Sexuality
  • Kids & Family
  • Music
  • News & Politics
  • Religion & Spirituality
    • Buddhism
    • Christianity
    • Hinduism
    • Islam
    • Judaism
    • Other
    • Spirituality
  • Science & Medicine
    • Medicine
    • Natural Sciences
    • Social Sciences
  • Society & Culture
    • History
    • Personal Journals
    • Philosophy
    • Places & Travel
  • Sports & Recreation
    • Amateur
    • College & High School
    • Outdoor
    • Professional
  • Technology
    • Gadgets
    • Tech News
    • Podcasting
    • Software How-To
  • TV & Film

11/23/2008

SpokenWord.org: The Survey

8:03 am

If you’ve been following the development of SpokenWord.org, it would be very helpful if you’d take our short survey about new features.

11/21/2008

WiserEarth

11:24 am

I’m featured on the WiserEarth homepage today. Well, “featured” is probably too strong a word — it’s just a thumbnail image — but that’s what the email I received from the WiserEarth Chief Editor called it.

11/18/2008

A Liberal Against a Detroit Bailout

11:02 am

I find myself siding with the Republicans on this one. Sorta weird. Tom Friedman has it right. I can’t see a good reason why we should put taxpayers’ dollars into a dying industry. GM, Ford and Chrysler’s management have done a miserable job, and unless they go through a serious shakeup such as a Chapter 11 bankruptcy, they shouldn’t continue to exist. The writing is on the wall for them. The Emperor has no clothes. As for investors, lenders (I am one, through funds) and management, they deserve to suffer the consequences of how these companies have been run. The only constituents who may be entitled to taxpayer assistance are the autoworkers and employees (not executives) of the small suppliers.

This brings up the union, healthcare and pension issues. Messy, to say the least. Personally, I’ve had a love/hate relationship with unions. I believe in the basic concepts of collective bargaining and I recognize that without the ability of employees to organize, employers will exploit them unfairly. But while the major U.S. unions have done an admirable job of growing benefits for their members, there now exist inequities in the benefits and pensions between union and non-union workers in this country. True, the UAW has accepted some concessions in recent years, but the fact is that GM and others are under a tremendous burden in supporting their former employees. This, by the way, is what the Republicans are thinking but not saying. By withholding from Detroit another $25 billion, they’re fostering union-busting through the bankruptcy process.

Although I’m not anti-union, this could ultimately be a good thing. Rather than spending billions on propping up the corporations, I’d like to see Obama and the Congress take this as an opportunity to start providing universal healthcare for all (not just out-of-work auto workers) and beefing up the Social Security System. I think we’re the only country in the world that ties healthcare to employment, which is nuts. And we’ve all seen what will happen if we continue down the Republican path of increased privatization of retirement benefits.

Let the Big Three go into Bankruptcy. That’s what it’s for. There’s a process that has been tweaked for decades as opposed to the Paulson/Bernanke methodology of writing checks without adequate conditions and then seeing what works and what doesn’t. Let the old and broken institutions crumble. Only then can we get to the bottom and build a more honest and sustainable world. Avoiding the inevitable never works, by definition.

Amazon CloudFront

9:24 am

For the past three months we’ve been beta-testing a new Amazon web service now named CloudFront. The best way to think of CloudFront is a high-performance front end for Amazon’s S3, based upon edge servers located closer to your web site’s visitors.

I’ve been favorably impressed with the new service. To try it out, I went for the low-hanging fruit by simply changing delivery of our CSS and JavaScript files to CloudFront. Performance-wise, these are our most-critical files because browsers run single-threaded while fetching and processing CSS/JS files. After the change, the download speeds of these files fluctuated between 3x and 4x faster than when delivered from our dedicated servers at The Planet in Texas. The key, in looking at the network histograms, is the all-important ‘first-byte delivery time.’ Net improvement: ~750 milliseconds for the load of any of our pages, based on measurements here, 12 miles north of San Francisco. The entire change took only about 15 minutes of effort, including creating a new S3 bucket, copying the files, modifying our code — all the changes were in one file — and establishing a new CNAME, which is optional.

Amazon calls CloudFront a “web service for content delivery,” which isn’t quite the same thing as a content-delivery network (CDN). The difference (for us) is that CloudFront doesn’t (yet?) operate as a pure cache, running off our “origin server” in the same way as we deliver our media files via Limelight Networks, a true CDN. In the case of Limelight, we just maintain the files on our own server, set up a CNAME that refers to Limelight’s edge servers and that’s it. When we add or modify a file on the origin server, that’s all we have to do. Limelight instantly (and I mean that literally) begins to deliver the new version worldwide. We don’t have to do anything manual or otherwise to keep the CDN copies of our files fresh. In the case of CloudFront, you still have to take certain actions (which could be automated, of course) to get new and updated assets from your primary servers pushed to their edge servers.

But while CloudFront may not be a pure CDN at this time, it’s extraordinarily cost-effective. It’s a no-brainer way to speed up almost any web site. For those assets like CSS, JavaScript files, frequently used images, icons, etc., the performance is as good as any CDN I’ve used but at a fraction of the cost. Pricing has two components. For assets served from U.S. edge locations:

  1. $0.170/GB data transfer out
  2. $0.010 per 1,000 GET requests

Charges are lower as volume increases, but higher for delivery from their European and Asian edge locations.
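
To get a feel for what those two components add up to, here is a rough Python sketch. The two prices are the U.S. figures listed above; the traffic numbers in the usage note are made up for illustration, and the function ignores the volume discounts, the higher European and Asian rates and any S3 storage charges.

```python
# U.S. edge-location prices quoted in this post (US$).
PRICE_PER_GB_OUT = 0.170       # per GB of data transferred out
PRICE_PER_1000_GETS = 0.010    # per 1,000 GET requests


def monthly_cost(gb_out, get_requests):
    """Estimate a month's delivery bill at the flat U.S. rates (no volume tiers)."""
    transfer = gb_out * PRICE_PER_GB_OUT
    requests = get_requests / 1000 * PRICE_PER_1000_GETS
    return round(transfer + requests, 2)
```

For a hypothetical site pushing 50 GB and 2 million GET requests in a month, `monthly_cost(50, 2_000_000)` works out to US$8.50 for transfer plus US$20.00 for requests. Notice that for lots of small files like CSS and icons, the per-request charge can outweigh the bandwidth charge.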

(Aside: One thing I love about all of the AWS services is that by publishing their prices so clearly, they set a very public bar against which all other providers are instantly measured. This happened with S3, and it’s going to happen with CloudFront. Pricing of storage, hosting, servers and now content delivery was previously mysterious and highly negotiable — like by an order of magnitude. AWS has brought transparency to the world of web-service pricing.)

Consider, too, that CloudFront is a completely self-service offering with no minimums, setup costs or hassles once you’re into the whole AWS world. As far as reliability, we never had a single failure or outage that I’m aware of during the entire three-month test period.

11/17/2008

Highest-Rated Programs

9:16 am

Over the weekend I added a Highest Rated tab to the SpokenWord.org homepage. This is something I’ve done before on sites like IT Conversations, and it’s always a challenge. On one hand you want the feature to honestly display the highest-rated programs, but on the other hand you don’t want the list to get stale. You want to avoid the situation in which the most-popular items become increasingly popular and lock themselves into the top slots.

Working with my personal on-call mathematician, Bruce Sharpe, I’ve implemented an algorithm that is at least a good first cut. There are a number of tweakable parameters that have yet to be tweaked. The concept is to discount ratings by two factors: (1) discount each individual rating by the age of that rating; and (2) discount the adjusted average rating by the inverse of the number of ratings the object has received. Highest Rated is therefore influenced by (but not the same as) a popularity index.

At Bruce’s urging, I’m using the tanh() (hyperbolic tangent) function to determine the curves for both discounting formulas. In about 34 years of writing code I can honestly say that’s a first for me. I once wrote an entire floating-point runtime library in assembler language — yeah, that’s a challenge! — but I’ve never had much need for those trig functions myself.
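
For the curious, here is a minimal Python sketch of how tanh() can shape both discounts. It is one possible reading of the description above, not the code running on the site: the two scale constants are invented placeholders for the “tweakable parameters,” and the exact way the two factors combine is an assumption.

```python
import math

# Hypothetical tuning knobs; the real values are not given in this post.
AGE_SCALE_DAYS = 30.0   # how quickly an individual rating fades with age
COUNT_SCALE = 5.0       # how many ratings it takes to earn near-full credit


def age_weight(age_days):
    """Weight for one rating: 1.0 when brand new, sliding toward 0 as it ages."""
    return 1.0 - math.tanh(age_days / AGE_SCALE_DAYS)


def count_factor(num_ratings):
    """Credit for rating volume: near 0 for very few ratings, approaching 1 for many."""
    return math.tanh(num_ratings / COUNT_SCALE)


def score(ratings):
    """ratings is a list of (stars, age_days) pairs; returns the discounted score."""
    if not ratings:
        return 0.0
    # Step 1: discount each individual rating by its age, then average.
    adjusted_average = sum(stars * age_weight(age) for stars, age in ratings) / len(ratings)
    # Step 2: discount the adjusted average when there are only a few ratings.
    return adjusted_average * count_factor(len(ratings))
```

The appeal of tanh() here is that it rises smoothly from 0 and flattens out at 1, so neither discount ever runs away: the tenth rating matters much more than the hundredth, and a rating from last year counts for almost nothing.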

The Highest Rated tab on the homepage currently shows too many programs from IT Conversations because of the recovery from a recent database coding error (mine), but over the next few days as the ratings age, the fairness of the algorithm should kick in, yielding more valuable data.

11/14/2008

Peter Schiff on the Economy: Stubborn but Right

10:50 am

Fascinating video:

Thanks to Hugh, Derek and Jason.

11/13/2008

What About the Rental Option?

10:17 pm

All the plans we’ve heard to date for solving the housing portion of the financial crisis are focused on keeping people in their homes by reducing the costs of mortgages until even the unemployed can afford one. That’s the kind of populist thinking that got us into this mess in the first place.

Let’s be honest: Not everyone should be a homeowner. Regardless of whom you want to blame for how we got here, some of us are facing mortgage payments we’ll never be able to make even under renegotiated terms and reduced interest rates. Even in what Conservatives call an Ownership Society, those without the cash flow necessary to build equity are better off as tenants rather than be burdened with the debt of ownership.

Instead of the government purchasing bad loans, as Senator McCain once suggested, or buying up the loan derivatives, as the Paulson plan originally intended, or just handing money to financial institutions for them to use for “whatever,” let’s create a program similar to the depression-era Reconstruction Finance Corporation (RFC) to federally fund state and local governments to acquire the underlying properties of defaulted loans at a steep discount and then turn around and rent those homes to the current occupants. Besides, it looks as though it’s going to be impossible to refinance any of the securitized loans. They’ve been bundled, chopped into tranches, then bundled again and there’s no way to figure out who holds which mortgages. (I’ll bet that’s something that won’t be permitted once the dust settles from all of this.) Regardless of who they are, the mortgage holders (lenders, insurers and hedge funds) will feel some of the pain for their indiscretions, but it will stabilize and put a stop to their losses, allowing the credit markets to finally move forward.

People who stand to lose their homes and who would otherwise be out on the street become renters, which of course they should have been all along. This eliminates the problem of those homeowners who continue to pay their mortgages feeling like their neighbors are receiving an unfair bailout. And setting a value on the real estate is far simpler than trying to find the fair price to pay for credit-default swaps on securitized loans. The federal government would set standards for the program and provide oversight.

With an average pre-slump U.S. home price of $215,000 and a 25% discount, $700 billion allows us to acquire nearly four million homes, even including a 10% cost to administer the program. Why fund state and local governments? Because the closer you get to the properties, the better a landlord you can be. Yes, as landlords the cities and states will have to manage and maintain these homes, but that’s much easier at the local level than from Washington. We can learn a lot from both the strengths and weaknesses of the RFC, created in 1932 and rolled into Treasury in 1953.
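
The arithmetic behind “nearly four million” can be checked with a few lines of Python. The figures are the ones in the paragraph above; the one assumption is that the 10% administration cost comes off the top of the $700 billion before any homes are bought.

```python
AVG_HOME_PRICE = 215_000          # average pre-slump U.S. home price (US$)
DISCOUNT = 0.25                   # acquisition discount on each property
BUDGET = 700_000_000_000          # total program funding (US$)
ADMIN_SHARE = 0.10                # assumed to be taken from the budget up front


def homes_acquirable(budget, avg_price, discount, admin_share):
    """Number of whole homes the program could buy after administration costs."""
    cost_per_home = avg_price * (1 - discount)      # $161,250 per home
    available = budget * (1 - admin_share)          # $630 billion left to spend
    return int(available // cost_per_home)


homes = homes_acquirable(BUDGET, AVG_HOME_PRICE, DISCOUNT, ADMIN_SHARE)
```

That comes to roughly 3.9 million homes. If the 10% were instead added to the cost of each home, the count would be slightly higher, so the estimate holds either way.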

To make sure our governments don’t stay in the real-estate business for the long term, after a two-year cooling-off period, the homes would first be offered for sale to the then-current occupants, then auctioned randomly over a five-year period to avoid further depressing the market with a sudden glut of even more homes for sale.

We won’t solve the housing crisis so long as we pretend that families who can barely make ends meet can afford the increased burden of building equity. Those “affordable” but unrealistic zero-down principal-only loans are what got us into this situation. So long as we pretend that we can make home ownership inexpensive enough for everyone, we’ll never dig our way out of this hole. Allowing people to rent the homes they currently occupy not only keeps roofs over their heads, it’s also simple (as compared to other options) and solves our housing-crisis problems directly. It removes the bad-loan problem from the books of financial institutions without rewarding them for their misconduct.

(Seven weeks ago I blogged a draft of this idea, which I followed up with an op-ed submission to the NY Times. Of course, lacking a Nobel Prize, I wasn’t likely to be successful, but I had to try, right? The above is an updated version of the article I submitted to The Times.)

The First Bit of Magic

8:06 pm

I think I just added the first piece of magic to SpokenWord.org. If you click on the Recently Collected tab, you’ll see a list of the programs most-recently added by members to their collections. But click on the numbers under the images and you’ll see *which* collections those are. Why is this magical? Because that’s the way you’ll find “more like this” — other programs explicitly collected by other members. There’s a lot more of this to come, but this is the first step.

11/12/2008

Homepage Experiment

10:38 pm

Spending an hour on a Skype call with Bruce Sharpe last night gave me some good ideas about the SpokenWord.org homepage. We looked at a few sites together, and with Bruce’s encouragement I decided that SoundFlavor has a lot of good ideas on their homepage. So in the past 24 hours I’ve completely re-done the SpokenWord.org homepage, lifting many ideas from SoundFlavor and other sites. Yeah, the colors are still awful and I can’t design (or use Photoshop) worth a damn, but I think it’s a whole lot better for first-time visitors without sacrificing usability for our experts.

11/11/2008

Location Sound: The Basics and Beyond

9:09 am

Thanks to Israel Hyman on Facebook, I just found this six-year-old article by Dan Brockett entitled Location Sound: The Basics and Beyond. I spot-checked a few of the topics I know most people get wrong, and Dan got them right. Great resource for film, video and audio recording.

Boogie Man Tonight on PBS

8:48 am

PBS’s Frontline is showing Boogie Man: The Lee Atwater Story tonight. The film was shown at both the Democratic and Republican conventions, and both audiences liked it. I guess we all see what we want to see. I had the pleasure of screening this extremely well-done documentary at the Mill Valley Film Festival, and I highly recommend it. You’ll learn a lot about how the U.S. political scene has changed in recent decades.

11/10/2008

SpokenWord.org Alpha 0.4

11:43 pm

A week ago, I blogged about a conceptual breakthrough in the design of SpokenWord.org. In a nutshell, I realized the site would ultimately be built on the relationships of people: one member recommending programs to others, and members following others and their recommendations. The concept survived Doug’s 48-Hour Rule for Conceptual Breakthroughs, and I’ve spent the past five days coding and debugging the first component, which turned out to be collections. Formerly known on the site as Playlists, Collections are just that: collections of programs, RSS feeds and now even other collections that any member can create, curate and share. Yes, allowing collections to include collections without all Hell breaking loose was tricky. Details are in the FAQ.

I’m currently struggling with a few more issues. First is that the site has four classes of “objects” and it’s currently difficult to (visually) tell which is which. We’ve got members, programs, feeds and collections, and they all look pretty much the same. The question is, therefore, how to change their visual representation so visitors can tell which they’re looking at. Seems pretty basic, but I’m visually challenged and I’m too close to the code.

Another big issue is how and what to feature, in particular on the home page. Experts (and our advisors) tell us that you should be able to play programs immediately from the home page. (That’s not quite the case, yet.) But we also want to encourage the social activities: collecting, recommending, sharing and following. How can we strike the balance between just providing access to the programs and at the same time getting people to interact with one another? Like I say, that’s what I’m scratching my head about tonight.

So please come by and check out Alpha 0.4. The new features are still a little rough around the edges, particularly some Ajax issues on IE6 and IE7. But you should be able to register, login, create collections and add programs, feeds or other collections to your collections. You can find great programs by browsing, searching — Advanced Search is in good shape — or just clicking on tags. You can send me feedback, leave a comment here, or (if you’re interested in our planning process) join the strategy discussion and post your ideas there.

11/3/2008

The Best Election Coverage You’ve Never Heard

6:41 am

Like many, I’ve become a cable-news junkie. I spend far too much time flipping back-and-forth between CNN and MSNBC looking for the least-objectionable coverage of the same old stuff, over and over again. But for a refreshingly different perspective, for unique stories that you probably won’t find anywhere else, I’ve been listening to ‘08 Conversations. Today’s show is a great example:

How are young adults dealing with the issues of elections and government? Are they more or less likely to vote? As a part of the first post-September 11th generation, their opinions and actions are thought-provoking. Amina Al-Sadi, a college freshman, is featured in an excerpt from a public radio special produced by and for teenagers.

PRX’s Charles Lane has spent the past five months finding these terrific programs from independent radio producers, working with our own series producer, Joel Tscherne, to bring you a new show every Monday. Visit ‘08 Conversations. I’m sure you’ll find something inspiring, different and a cut above the TV noise.

Conceptual Breakthrough

12:06 am

I think I’ve had a conceptual breakthrough. I’ll let you know, if it lasts 48 hours — known as “Doug’s 48-Hour Rule for Conceptual Breakthroughs.”

The breakthrough is that SpokenWord.org isn’t really about the content as much as it’s about the connections between the people. (Duh. That’s why they call it “social networking.”)  I’ve been focused on submission of programs, creating playlists, etc. Sharing has been based until now on old-fashioned ideas such as “mail to a friend.” The concept of ratings has been applied to programs, feeds and playlists, but I learned years ago that you have to be extremely careful when you actually rate people. It works on eBay, but that’s rare. Saying “good” or “bad” about a person is fraught with problems. But you can rate a person by “voting with your feet” and that’s the starting point of the breakthrough. It’s eBay+Twitter, and we have Twitter to thank for showing all of us the idea of “following” one another. You don’t have to explicitly say you like or don’t like someone. What you need to do instead is to “follow” someone and to be able to see who’s following whom and how many followers and followees someone has.

It clicked for me when Stephen Hill sent me a link a few days ago to lala.com. I urge you to check it out. And don’t just visit — register and add some songs to your collection so you can see how it works and understand this discussion. The folks at lala.com recently made some major improvements to the site and it has some terrific features.

If this survives Doug’s 48-Hour Rule, I think I’ll start by adding general support for the concept of following. I’ll make it easy to follow someone who seems to be promoting content you like. The content you see will then be that which has been rated highly by those that you’ve chosen to follow, and the content will be identified by those sources. “Number of followers” will become a reputation metric, much as it is on Twitter, but I think it will be less polluted by the “star” factor simply because SpokenWord.org isn’t going to appeal to such a general audience. Eventually you may even be able to adjust the weight of the opinions of those you follow: Some people’s opinions would carry more weight than others.

Also like lala.com, I’m considering the concept of “influencers” and the associated point system. I need to think this through in more detail, however, and consider what’s working on other sites like StackOverflow. Twitter is so beautifully simple and lala.com is already a bit confusing. It took me some time, for example, to understand the difference between my collection and my playlists. (Do I really need both?) Maybe following is all we need. TBD.

Once again, the strategic discussion of the future of SpokenWord.org is happening on our Google Group. You’re welcome to join if you want to help us figure this out.

11/2/2008

Calls for Obama

6:39 pm

We went to the local DNC headquarters again today to make get-out-the-vote calls into northern Florida for the Obama campaign. Amazing turnout: maybe 100+ people with mobile phones in every nook and cranny, even sitting on the sidewalk outside. Our little local office alone is aiming for 20,000 calls/day for this last weekend.

Lots of nice people in Florida. Even McCain supporters were polite. One guy said he and his wife had already voted for McCain, but he wasn’t sure for whom his son (the person I was calling) was voting because he was in Iraq. My favorite was the woman who said her husband wasn’t at home, but she was going to “make sure he gets his ass down there on Tuesday!” She said he had already waited two hours in line on Thursday, but couldn’t stay longer. A huge percentage of those we spoke to had already voted. Seems quite common in Florida. Overall, it’s very satisfying to make these calls. Other than giving money, it’s one activity that makes you feel like you’re making a difference. I hope to be back there tomorrow.
