Monday, February 11, 2013

Simple Data Management for Small Projects

How can we help small projects manage environmental data efficiently? By efficient, I mean reduce labor and increase quality.

As environmental data management professionals, we typically serve projects at any of three scales:

1)  Multi-Facility Enterprise Projects;

2)  Large-scale single facility projects such as Superfund (Super Fun?) projects; and

3)  Small-scale projects such as Phase II investigations or gas station-sized projects requiring regular monitoring of a few wells.

Most data management systems help us do a great job serving the Enterprise and large-scale projects. But I’ve noticed that in general, it’s tough to serve the small projects efficiently. I often feel that the overhead required to set up a database and generate data summary tables requires more labor than is justified. Hold it- sit down… of course I’m generalizing, and there are exceptions.

For example, I worked in an office where our whole data management system evolved to serve several large Superfund and military base projects where we efficiently generated electronic COCs, tracked samples, loaded EDDs, and qualified hundreds of samples per event. All the field techs, project managers, and chemists, as well as the 7 or 8 labs we regularly worked with, were dialed in to our “system.” A well-established data workflow defined our office culture and consequently, made it easy to start up projects of any size. Once the project was set up, we could quickly generate well formatted report-ready data tables (like the one below) and maps making data management appear effortless. Hmmm… am I dreaming?



Conversely, many offices without a data management culture feature work team silos with widely ranging regulatory or client needs. Unfortunately, these small projects are much harder to support efficiently. The teams (rugged individualists maybe?) likely use very specific Excel data summary tables and are more efficient using a hands-on approach to manage their data- their hands I mean, not data managers. They literally add data by hand to existing Excel templates or workbooks. Yeah- they have to QC every value they hand-enter, but don’t you know they’re going to do that to every database report you propose to generate anyway!

I believe these folks get better value with their approach than if they had to wait around for EDDs to get fixed and loaded, then get tables made (with the correct screening levels), followed by grinding through several iterations to the tables the way they or the client want them. Using the DIY approach, they gain hands-on control of their work. They don’t have to go through a data specialist unfamiliar with their specific needs, and who is not always immediately available as data needs/revisions arise in the 11th hour. In other words, as hard as it is for us data managers to swallow, small teams managing their data by hand literally can give their project better quality and less “friction.”

Of course, I wouldn’t have written this post if I didn’t think there was a better way.

So, how can we help these small teams reduce labor and increase quality? The key is developing a simple set of tools allowing small project teams to manage their data themselves without specialized training. I’m imagining a system where small projects can reap the benefits of querying and reporting well-formatted tables and charts without the inherent complexity of a full-on data management system.

What would those tools look like? I’ll pitch some ideas in my next post.

Tuesday, January 29, 2013

Fourth Annual Environmental Data Management Conference

Howdy-

I'm looking forward to dusting off this blog!!


I'm happy to announce that the fourth annual International Conference for Environmental Data Management (ICEDM) happens May 14-15, 2013 in Portland, Oregon. Better yet, Sarah Wright of deMaximis Data Management, Inc. (ddms) joined me to co-host the conference.  Sarah's a great speaker, knows people throughout the environmental data management community, and has a ton of environmental data management experience.


Here's Sarah presenting at the 2012 Conference....
And here's a shot of the Conference Crowd....





So, what happened to the old conference name (NW EDMC)? Sarah and I felt the word "Northwest" gave it a bit of a provincial tone and didn't reflect how the conference has grown. After all, we've had attendees from Australia, Chile, Canada, and from all over the US including Alaska. We love the new name- "International Conference for Environmental Data Management" (ICEDM)- genned up by William Puknat of the Montana DEQ, one of our 2012 conference attendees. Along with the new name, we have a new website and a twitter account where we post conference news.



Check out the 2012 conference photos here. The attendee list grew in 2012 so we could add a reception following the first day's meeting including free drinks and food. Receptionaires (like that made up word?) had a great time socializing and talking shop with folks from all over the world.

Tuesday, February 21, 2012

3rd Annual Environmental Data Management Conference (May 15-16, 2012)

The 3rd Annual Environmental Data Management Conference happens this coming May 15-16 in Portland, Oregon.

The Northwest Environmental Data Management Conference (NW EDMC) provides environmental data management professionals with a two-day opportunity to:
  • Share ideas, tools, and custom applications;
  • Brainstorm and collaborate on data workflow; and
  • Present data management success stories.
We had a fun and engaged group at the second annual in May 2011 and are looking forward to a larger event coming May 15-16, 2012. Click here to see the 2011 agenda and presentations.

Click here to Register for the 2012 Conference (May 15-16, 2012).

We're actively seeking presenters for the 2012 Conference. Please email me with a short paragraph pitching your topic.


The 2012 Conference is sponsored by GeoEngineers, Inc. and Earthsoft, Inc.

Wednesday, March 9, 2011

EQuIS consulting, training, and mentoring

Disclosure: You came here looking for data management tips and I'm going to pitch you consulting services- just once. And here's why....

My previous employer was hit hard by the recession and unfortunately, that meant that I was laid off.

Not to worry though- I've already had resumé requests and it seems the impending economic recovery has created new job opportunities. In the meantime, I would really enjoy getting out and meeting more environmental data managers in order to learn how you're doing your job, as well as to share some of my tools and techniques with you.

I'd be happy to come by your office for a couple of hours just to meet you (or re-acquaint with you) and share data management ideas with you.  Of course, we can talk on the phone or use web-meeting software as well.  If you need more in-depth help such as data migration or report customization then I'd be honored to help you out as an outside consultant.

You can download my resumé here.

    As always, feel free to contact me with specific data management questions.
    I look forward to hearing from you,
    Dan
    dan@danhigginsdesigns.com

    P.S. The Environmental Data Management Conference is on for June 9 and 10, 2011 at the Edgefield here in Portland. There will be more info here soon. Also, Earthsoft will be posting information on their community site soon and everyone on Jennifer Rea's contact list will get an email announcement.

    Tuesday, January 18, 2011

    Migrating Historical Data

    We're just wrapping up a big historical data migration into an EQuIS database. The data are needed to support a site remedial investigation.  This migration had an interesting twist I thought I'd share with you. First a little background on data migration.

    In the worst cases historical data comes from a mish-mash of hard-copy reports, EDDs in various formats, and Excel tables. Crunching through these reports and figuring out how the results tie into various location name spellings is tedious and error-prone.  For example, are "S1-MW01" and "SMW1" the same location? Conversely, the best cases provide you with an Access database that's somewhat normalized with unique lists of valid values and obvious relationships between key data tables. Between these two extremes lies the task of crunching a pile of cross-tab Excel tables. 

    The intrepid (curious and determined) data cruncher can take an Excel table and "unwind" it into a columnar database-friendly form using a macro that reads through the table using the same loop-within-a-loop procedural logic I presented here.  The columnar output can be easily rearranged into an ESBASIC EDD using Access. I like Access for this task rather than Excel. In Excel I find I spend a lot of time running the lookup function and creating key columns with formulas that have to be pasted down. In Access I set up an ESBASIC template and then build queries that move the "unwound" columnar data from a table in Access to the ESBASIC EDD. I'll even link my queries to reference tables to get the proper valid values applied to the EDD. Then, I export the data from the ESBASIC template and load it into EQuIS. If the EDD has errors and won't load, I can make an adjustment or two in the Access query and give it another try. This allows quick, successive iterations of the migration actions to ESBASIC and saves a ton of time. Conversely, it's easy to make a little error when using Excel to make the transition to ESBASIC that forces you to start your multi-step process all over.
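    For the curious, here's a minimal sketch of that unwinding loop in VBA. The sheet names ("CrossTab" and "Columnar"), the row/column ranges, and the three-column output are just placeholders for illustration- adjust them to fit your own table layout.

```vba
'Sketch: unwind a cross-tab table into columnar records.
'Assumes analyte names in column A (rows 2+), location IDs in
'row 1 (columns B+), and results in the grid between them.
Sub Unwind_CrossTab()
    Dim wsIn As Worksheet, wsOut As Worksheet
    Dim i As Long, j As Long, outRow As Long

    Set wsIn = Worksheets("CrossTab")   'the cross-tab source
    Set wsOut = Worksheets("Columnar")  'the columnar destination

    'Header row for the columnar output
    wsOut.Cells(1, 1) = "Location"
    wsOut.Cells(1, 2) = "Analyte"
    wsOut.Cells(1, 3) = "Result"
    outRow = 2

    For j = 2 To 10      'columns B through J hold locations; change to fit your data
        For i = 2 To 50  'rows 2 through 50 hold analytes; change to fit your data
            If wsIn.Cells(i, j) <> "" Then  'skip empty cells
                wsOut.Cells(outRow, 1) = wsIn.Cells(1, j)  'location from the header row
                wsOut.Cells(outRow, 2) = wsIn.Cells(i, 1)  'analyte from the first column
                wsOut.Cells(outRow, 3) = wsIn.Cells(i, j)  'the result itself
                outRow = outRow + 1
            End If
        Next i
    Next j
End Sub
```

    Once the data is in this one-record-per-result form, the Access queries described above can map it into the ESBASIC template.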

    The unusual data migration we're finishing now began as an 800+ page pdf!! No kidding. The cool part was that all the data was collected by one consultant and the tables had clearly been created from a well organized database. No, we couldn't get the database- wouldn't that have been nice? We tried reading the data into a text file with optical character recognition (OCR), but the tables had these alternating light gray bands that messed with the software's ability to read the characters accurately.

    So, we hand entered it all. Really.  For tables like VOCs with lots of NDs we created a template and then copy-pasted each new table. The new tables' values were compared to the pdf and updated in cases where the detection limit varied or a spurious detect occurred. Once the tables were all created I used the Excel unwinding procedure I described above to create columnar data.

    Tables with metals and wet-chem data that contained lots of detects were a little harder. To handle these I created an Excel data entry form that builds the EDD as the data is entered. The tab order for each record was set up to match the order the chemicals appear in the pdf tables.


    
    As usual, we used Access to create the ESBASIC EDDs. We loaded all the EDDs into EQuIS, then created crosstab data tables and hand-checked EVERY record against the source pdfs.

    I'm very pleased because the whole process went rather quickly considering the scale of the migration. A couple of years ago I would have bid this much higher than I did.  In the end we came out on budget and have a nice database that the project team can count on to start their remedial investigation.

    Friday, November 12, 2010

    EQuIS Enterprise Comes to AMEC

    Greetings:

    In the last month we set up an EQuIS server at AMEC's data center in Atlanta in preparation for EQuIS Enterprise. We're really excited about Enterprise because it will allow project team members across the globe to access their data via the web.  As Portland continues to provide environmental data management support to more and more AMEC offices the biggest request we get is for direct access to current data. As I've written before, we provide weekly tables for some projects, and recently have been providing Excel Workbook front-ends (which have been very popular). However, Enterprise will be light years ahead of those solutions because the team members will have access to all sorts of reports, and can even drill down into the data.

    We're also hoping that the ability for labs to email EDDs directly to Enterprise for automated loading will pan out. As has often been written, a good mature technology is one you don't notice. With that in mind, we use many different labs with widely varying abilities to create good EDDs. I'm anticipating that uploading success will range from seamless to "Holy Cow!! GET it together!". I know of one really good lab that I'm confident will be able to make it work. We NEVER have problems loading their EDDs into EQuIS Professional. I'll definitely shout out kudos to them once I confirm their mad skills uploading EDDs to Enterprise. Conversely, we work with another well-known lab that CANNOT get it right, even when our lab contract includes financial penalties for delays caused by bad EDDs.

    Even without Enterprise set up yet, the move to Atlanta has significantly sped up EQuIS access for the off-site AMEC offices we support.  User access to the old server here in Portland was throttled by our relatively limited bandwidth as compared to the giant data pipes channeled into the data center.

    Enhanced data security is another upside to moving the databases to Atlanta. Here in Portland I'd set up an automated daily data backup script that created a backup volume for any database on the server. However, the backups were stored locally on the server. Because the server didn't have its own AMEC account I was unable to set up an automatic method to transfer the files to another hard drive. So, every Monday morning Medora moved the files manually to another computer in case the server crashed and burned. The local backup files on the server were copied to a tape drive every night as well. Our databases in Atlanta now get a full backup four times a day and the backups are stored at a third-party data center in another state. I feel pretty good about that! Four times a day definitely exceeds our needs.

    Back to Enterprise: As of today I've got a virtual client server set up for the Enterprise install. The virtual server will serve as a client to the data server. I've got all the EQuIS databases upgraded to v5.5 and am ready to dive in with the Enterprise set-up. I'll keep you posted.

    Dan

    Thursday, November 4, 2010

    Use VBA to Bold Detects by Looping through an Excel Cross-tab

    The EQuIS cross-tab report is super handy and reliably spits out nicely organized cross-tab reports from EQuIS queries (reports).  Earthsoft even worked out some cool conditional formatting such as underlines and italics to show where reported concentrations exceed screening levels. This in-app formatting is handy for screening, but looks ugly in a final report table.

    For most of AMEC's projects we're asked by the project teams to format the data in Excel using a combination of BOLD for detects and some cell-shading color scheme to call out results equal to or above action levels such as risk screening levels or regulatory levels like MCLs.  Colored tables cost more for report production and of course, you lose the color format's value when the table is photocopied in black and white.  If there's only one action level, you're golden- just use BOLD for detects and a light shade of yellow for an exceedance which shows up as light gray in a photocopy, but still catches your eye on the monitor.

    AMEC's data management team has used Visual Basic for Applications (VBA) to work up all sorts of Excel-based formatting apps for post-processing EQuIS reports. The most complex apps set up page headers and footers with table names, project numbers, and draft stamps, as well as insert table-specific notes in the appropriate location depending on a landscape or portrait table format and layout borders and page breaks.

    At the heart of all our VBA-controlled formats is logic to correctly BOLD detects and manage shading of cells containing data with action level exceedances. Explaining all the logic for the complex formatting described above is beyond the scope of a blog post, but I will show you how to get started using VBA to loop through a cross-tabbed data set and locate and format detects. Next post we'll get into shading results that exceed screening criteria.

    To get a handle on looping through data you'll need to set up a way to use VBA to move from cell to cell in a predictable manner. Picture an automated way to move through your table with the up-down and side-to-side arrow keys. I always loop through the data going downward from row to row beginning in the first data column. I'm always consistent in my approach so it's become easy for me to recreate a loop without much thought- there was a LOT of head scratching at first!!  The idea is to begin at the first column and move downward row by row until there's no more data, then move back to the first row, and shift left to the next column, and so on.

    To do this you'll need to write two nested loops in VBA, one to move through the rows and one to move across the columns.  The code below depends on the lack of a "U" qualifier to determine if the cell value is a detect. In the first example below (Example 1) the code handles the case where qualifiers are in the same cell as the result value like this.






    In Example 2 I'll show you how to handle cases like this where the qualifiers are in their own column next to the result value.

    To try this out just create a new Excel VBA module in the workbook with your data and drop the code into it. Hopefully my notes in the code will explain how it works...

    The code could be modified slightly to handle a "<" symbol should that be your way to identify NDs.
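    Here's a quick sketch of that "<" variant, using the same placeholder row/column ranges as Example 1 below- just swap the character the InStr test looks for:

```vba
'Sketch: same nested loop as Example 1, but treating a leading "<"
'as the non-detect flag instead of a "U" qualifier.
Sub Loop_thru_cells_LT()
    Dim j As Integer, i As Integer  'j for columns, i for rows
    Dim sVal As String              'holds the cell contents as text

    Cells.Font.Bold = False  'clears all bolds in the table
    For j = 5 To 100         'change the column range to fit your data
        For i = 4 To 88      'change the row range to fit your data
            sVal = Cells(i, j)
            If InStr(sVal, "<") = 0 Then 'no "<" in the cell means a detect
                Cells(i, j).Font.Bold = True
            End If
        Next i
    Next j
End Sub
```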

    Feel free to drop me an email or call if you have problems or need more clarification.

    I'm thinking of putting on a one day Excel VBA for Environmental Data course in AMEC's training center here in Portland. Please let me know if you're interested.

    Dan


    'EXAMPLE 1
     'Sub to Bold detects where result and qualifier are in the same cell
    Sub Loop_thru_cells1()
    Dim j As Integer, i As Integer  'declare variable j for columns and i for rows
    Dim sVal As String    'declare sVal as a variable to hold the result and qualifier text string

    Cells.Font.Bold = False 'clears all bolds in the table
    For j = 5 To 100 ' Starting column = 5; last column with data is 100; change to fit your data
       For i = 4 To 88  'Starting row is 4; last row is 88; change to fit your data
            sVal = Cells(i, j)  'set variable sVal = to the cell contents
            If InStr(sVal, "U") = 0 Then 'InStr returns 0 when no "U" is found, so the value is a detect; this also covers "UJ"
                 Cells(i, j).Font.Bold = True 'bold the detected result; cells containing a "U" are left unbolded
             End If
       Next i  'next row; when i = 88 this loop ends and the column (j) loop increases by 1
    Next j  'next column

    End Sub



    'EXAMPLE 2
    'Sub to bold detects where result and qualifier are in adjacent cells
    Sub Loop_thru_cells2()
    Dim j As Integer, i As Integer  'declare variable j for columns and i for rows
    Dim sVal As String    'declare sVal as a variable to hold the result and qualifier text string

    Cells.Font.Bold = False   'clears all bolds in table
    For j = 5 To 100 Step 2 'step 2 because each result column is paired with a qualifier column
       For i = 4 To 88  'Starting row is 4; last row is 88; change to fit your data
            sVal = Cells(i, j + 1) 'sets the string variable = to the cell adjacent to the result
            If InStr(sVal, "U") = 0 Then 'search the string for presence of a "U"; this covers "UJ";
                Range(Cells(i, j), Cells(i, j + 1)).Font.Bold = True 'sets both the result and qualifier cells as bold
            End If
       Next i  'next row; when i = 88 this loop ends and the column (j) loop increases by 1
    Next j  'next column

    End Sub