I get a kick out of seeing "ordinary" information technologies make new kinds of science possible. For example, the ability to move and share digital data is a pretty mundane topic in computer science circles. The mass-market entertainment industry (think: iTunes, Netflix) has even brought streaming data and multi-megabyte files into our living rooms -- for fun! Despite this dramatic adoption and transformation in some industries, there are still plenty of areas where the effects are being felt for the first time.

I think I can speak for most biologists when I say I never thought I would be worrying about file transfer. Compute power, yes. Storage space, maybe. But file transfer? Never. Unlike some other scientific disciplines, biology is not traditionally a ‘big data’ science. Generally, biologists produce data on a scale suited to e-mail attachments. However, seemingly overnight, biology has been propelled into the ranks of the big data sciences. Now a biologist can easily find herself confronted with terabytes of data. Why the change? The answer lies in the recent quantum leaps in DNA sequencing technology.

I am delighted to write this inaugural post for "GO Behind the Scenes" -- our Dev column in the new Globus Online (GO) blog. Entries may come from any member of the GO development team. We're looking to provide a small window into what we do to make GO go! This one comes from Stuart Martin, the GO software development manager for the backend. (The backend is the engine that takes the file transfer requests and moves the bytes from point A to point B.) Since we like to plan big, we run all of GO on Amazon's EC2. This way, when the time comes, we will be able to run instances worldwide.

We are very pleased to see so many of our users relying on Globus Online to support their scientific research -- this week I heard that one user had moved 6 TB of data over the course of two days. We think all of you would benefit from hearing more of these stories, both to see what your colleagues are achieving and to get ideas for your own work. So, we've decided to pick a User of the Month to draw attention to particularly impressive, innovative and/or widely applicable usage scenarios that the entire user community should know about. For our inaugural selection, I'm happy to announce that May's User of the Month is NERSC (National Energy Research Scientific Computing Center).

As I pondered what tone to set for this blog, I tried to put myself in the shoes of the professional researcher. While I’ve lived that life, the everyday life of the dedicated, full-time researcher has changed (and is still changing) beyond what I’ve experienced, and I feel for all of you out there who are challenged day in, day out with getting your work done in a rapidly changing world. For many researchers today, it’s all about wrangling massive data using faster and faster computers, while also struggling to keep ahead of the crowd by forging and sustaining ever-more-ambitious interdisciplinary collaborations. Those of you who have access to the right tools for these tasks are in the minority. Sure, big science projects have the capabilities for getting and working with the data they need, but the average hardworking independent researcher or smaller lab does not. So the challenge (and opportunity) is to make these capabilities accessible not just to a few “big science” projects but to every researcher everywhere.

Welcome to the Globus Online blog! On these pages we plan to present/discuss/debate a range of topics dear to the hearts of computational researchers: data movement, information sharing and collaboration, SaaS tools for researchers, grid vs. cloud (and who cares), and of course Monty Python. Most importantly, we will highlight stories about – and from – users who have found ways to use Globus Online to improve their work. The goal is to create a resource and forum to help make your lives easier.