Robert's Blog


Monday, May 17, 2010

Nuggets from DB2 by the Bay, Part 1

I had to smile when I saw the thread that Ed Long of Pegasystems started the other day on the DB2-L discussion list. The subject line? "IDUG radio silence." Here it was, IDUG NA week, with the North American Conference of the International DB2 Users Group in full swing in Tampa, Florida, and the usual blogging and tweeting of conference attendees was strangely absent. What's with that? Well, I'll break the silence (thanks for the inspiration, Ed), and I'll start by offering my theory as to why the level of conference-related electronic communication was low: we were BUSY.

Busy. That's the word that comes first to mind when I think of this year's event, and I mean it in a good way. At last year's conference in Denver, the mood was kind of on the down side. Attendance was off, due largely to severe cutbacks in organizations' training and travel budgets -- a widespread response to one bear of an economic downturn. Those of us who were able to make it to last May's get-together swapped stories with common themes: How tough is it on you? How many people has your company cut? How down is your business? A lot of us were in batten-down-the-hatches mode, and it was hard to get the ol' positive attitude going.

What a difference a year makes. The vibe at the Tampa Convention Center was a total turnaround from 2009. Attendance appeared to be up significantly, people were smiling, conversation was upbeat and animated, and there was this overall sense of folks being on the move: heading to this session or that one, flagging someone down to get a question answered, lining up future business, juggling conference activities with work-related priorities -- stuff that happens, I guess, at every conference, but it seemed to me that the energy level was up sharply versus last May. To the usual "How's it going?" question asked of acquaintances not seen since last year, an oft-heard response was: "Busy!" To be sure, some folks (and I can relate) are crazy busy, trying to work in some eating and sleeping when the opportunity arises, but no one seemed to be complaining. It felt OK to be burning the candle at both ends after the long dry spell endured last year. Optimism is not in short supply, and I hope these positive trends will be sustained in the months and years to come.

In this and some other entries to come (not sure how many -- probably another couple or so) I'll share with you some nuggets of information from the conference that I hope you'll find to be interesting and useful. I'll start with the Tuesday morning keynote session.

The data tsunami: big challenges, but big opportunities, too. The keynote speaker was Martin Wildberger, IBM's VP of Data Management Software Development. He started out talking about the enormous growth in the amount of data that organizations have to manage -- this on top of an already-enormous base. He showed a video with comments by some of the leading technologists in his group, and one of those comments really stuck with me (words to this effect): "You might think that the blizzard of data coming into an organization would blind you, but in fact, the more data you have, the clearer you see." Sure, gaining insight from all that data doesn't just happen -- you need the right technology and processes to make it happen -- but the idea that an organization can use its voluminous data assets to see things that were heretofore hidden -- things that could drive more revenue or reduce costs -- is compelling. As DB2 people, we work at the foundational level of the information software "stack." There's lots of cool analytics stuff going on at higher levels of that stack, but the cool query and reporting and cubing and mining tools just sit there if the database is unavailable. And, data has to get to decision-makers fast. And, non-traditional data (images, documents, XML) has to be effectively managed right along with the traditional numbers and character strings. Much will be demanded of us, and that's good (it'll keep us busy).

Martin mentioned that IBM's overall information management investment priorities are aimed at helping organizations to:
  • Lower costs
  • Improve performance
  • Reuse skills
  • Reduce risk
  • Reduce time-to-value
  • Innovate
He talked up IBM's partnership with Intel, IBM's drive to make it easier for companies to switch to DB2 from other database management systems (especially Oracle), and the "game-changing" impact of DB2 pureScale technology, which takes high availability in the distributed systems world to a whole new level. Martin also highlighted the Smart Analytics Systems, including the 9600 series, a relatively new offering on the System z platform (this is a completely integrated hardware/software/services package for analytics and BI -- basically, the "appliance" approach -- which had previously been available only on the IBM Power and System x server lines). There was also good news on the cloud front: DB2 is getting a whole lot of use in Amazon's cloud.

DB2 10 for z/OS: a lot to like. John Campbell, an IBM Distinguished Engineer with the DB2 for z/OS development organization, took the stage for a while to provide some DB2 10 for z/OS highlights (this version of DB2 on the mainframe platform is now in Beta release):

  • CPU efficiency gains. For programs written in SQL procedure language, or SQLPL (used to develop "native" SQL procedures and -- new with DB2 10 -- SQL user-defined functions), CPU consumption could be reduced by up to 20% versus DB2 9 (a sketch of a native SQL procedure follows this list). Programs with embedded SQL could see reduced in-DB2 CPU cost (CPU cost of SQL statement execution) of up to 10% versus DB2 9, just by being rebound in a DB2 10 system. High-volume, concurrent insert processes could see in-DB2 CPU cost reductions of up to 40% in a DB2 10 system versus DB2 9.
  • 64-bit addressing for DB2 runtime structures. John's "favorite DB2 10 feature." With DB2 thread storage going above the 2 GB virtual storage "bar" in a DB2 10 system (after a rebind in DB2 10 Conversion Mode), people will have options that they didn't before (greater use of the RELEASE(DEALLOCATE) bind option, for one thing). DB2 subsystem failures are rare, but when they do happen it's often because of a virtual storage constraint problem. DB2 10 squarely addresses that issue.
  • Temporal data. This refers to the ability to associate "business time" and "system time" values with data records. John pointed out that the concept isn't new. What's new is that the temporal data capabilities are in the DB2 engine, versus having to be implemented in application code (a system-time example follows this list).
  • Getting to universal. John pointed out that DB2 10 would provide an "ALTER-then-REORG" path to get from segmented and partitioned tablespaces to universal tablespaces (sketched after this list).
  • Access plan stability. This is a capability in DB2 10 that can be used to "lock down" access paths for static AND dynamic SQL (see the REBIND sketch after this list).
  • Enhanced dynamic statement caching. In a DB2 10 environment, a dynamic query with literals in the predicates can get a match in the prepared statement cache with a statement that is identical except for the literal values (getting a match previously required the literal values to match, too). A PREPARE sketch follows this list.
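
A few quick sketches to make some of those items concrete (all object names in these examples are mine, purely illustrative). First, SQLPL: here's a minimal example of a native SQL procedure -- the body is pure SQL procedure language, with no external load module, so the routine's logic runs in the DB2 engine and stands to benefit from the DB2 10 efficiency gains:

    CREATE PROCEDURE RAISE_SALARY
      (IN P_DEPT CHAR(3), IN P_PCT DECIMAL(4,2))
      LANGUAGE SQL
    BEGIN
      -- Pure SQLPL logic, executed within DB2 itself
      UPDATE EMP
         SET SALARY = SALARY * (1 + P_PCT / 100)
       WHERE WORKDEPT = P_DEPT;
    END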
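
Next, temporal data -- a hedged sketch of system-time versioning as I understand the DB2 10 syntax:

    -- Base table with a SYSTEM_TIME period (the row-begin, row-end, and
    -- transaction-start-ID columns are maintained by DB2)
    CREATE TABLE POLICY
      (POLICY_ID INTEGER NOT NULL,
       COVERAGE  INTEGER,
       SYS_START TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW BEGIN,
       SYS_END   TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW END,
       TRANS_ID  TIMESTAMP(12) GENERATED ALWAYS AS TRANSACTION START ID,
       PERIOD SYSTEM_TIME (SYS_START, SYS_END),
       PRIMARY KEY (POLICY_ID));

    -- History table, and the ALTER that switches versioning on
    CREATE TABLE POLICY_HIST LIKE POLICY;
    ALTER TABLE POLICY ADD VERSIONING USE HISTORY TABLE POLICY_HIST;

    -- "As of" queries are then handled by the DB2 engine, not by
    -- application code
    SELECT COVERAGE
      FROM POLICY FOR SYSTEM_TIME AS OF TIMESTAMP('2010-01-01-00.00.00')
     WHERE POLICY_ID = 1234;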
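
On getting to universal tablespaces, the ALTER-then-REORG path would look something like this (the ALTERs are pending changes that a subsequent REORG materializes):

    -- Classic partitioned to range-partitioned universal: add a SEGSIZE
    ALTER TABLESPACE MYDB.MYTS1 SEGSIZE 32;

    -- Single-table segmented to partition-by-growth universal
    ALTER TABLESPACE MYDB.MYTS2 MAXPARTITIONS 10;

    -- Then materialize each pending change with an online REORG, e.g.:
    --   REORG TABLESPACE MYDB.MYTS1 SHRLEVEL CHANGE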
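
For access plan stability on the static SQL side, the lock-down and fall-back mechanism works through REBIND options along these lines:

    -- Retain prior copies of the package's access paths at rebind time
    REBIND PACKAGE(MYCOLL.MYPKG) PLANMGMT(EXTENDED)

    -- If the new access paths cause a performance regression, fall back
    REBIND PACKAGE(MYCOLL.MYPKG) SWITCH(PREVIOUS)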
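
And on enhanced dynamic statement caching: as I understand it, an application opts in to literal concentration via the ATTRIBUTES clause of PREPARE (a sketch, with hypothetical host variables):

    -- Host variable :STMT_ATTRS contains the string
    --   'CONCENTRATE STATEMENTS WITH LITERALS'
    -- With that attribute, statements differing only in their literal
    -- values, such as these two, can share one cached prepared statement:
    --   SELECT BALANCE FROM ACCT WHERE ACCT_ID = 12345
    --   SELECT BALANCE FROM ACCT WHERE ACCT_ID = 67890
    PREPARE S1 ATTRIBUTES :STMT_ATTRS FROM :STMT_TEXT;
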
DB2 for LUW performance. John was followed on stage by Berni Schiefer of the DB2 for Linux/UNIX/Windows (LUW) development team. Berni shared some of the latest from the DB2 for LUW performance front:
  • Performance PER CORE is not an official TPC-C metric, but it matters, because the core is the licensing unit for LUW software. It's on a per-core basis that DB2 for LUW performance really shines versus Sun/Oracle and HP systems.
  • SAP benchmarks show better performance versus competing platforms, with FEWER cores.
  • TPC-C benchmark numbers show that DB2 on the IBM POWER7 platform excels in terms of both performance (total processing power) AND price/performance.
  • DB2 is number one in terms of Windows system performance, but the performance story is even better on the POWER platform.
  • Berni pointed out that DB2 is the ONLY DBMS that provides native support for the DECFLOAT data type (based on the IEEE 754r standard for decimal floating point numbers). The POWER platform provides a hardware boost for DECFLOAT operations (see the sketch after this list).
  • DB2 for LUW does an excellent job of exploiting flash drives.
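
To put a picture to the DECFLOAT item, a table using the type might look like this (table and column names are mine, purely illustrative):

    -- DECFLOAT: IEEE 754r decimal floating point, available in 16-digit
    -- and 34-digit precisions -- decimal accuracy with floating-point range
    CREATE TABLE TRADES
      (TRADE_ID INTEGER NOT NULL,
       PRICE    DECFLOAT(34),
       QTY      DECFLOAT(16));
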
Back to Martin for a keynote wrap-up. Martin Wildberger came back on stage to deliver a few closing comments, returning to the topic of DB2 pureScale. pureScale is a distributed systems (AIX/POWER platform) implementation of the shared-data architecture used for DB2 for z/OS data sharing on a Parallel Sysplex mainframe cluster. That's a technology that has delivered the availability and scalability goods for 15 years. So now DB2 for AIX delivers top-of-class scale-up AND scale-out capabilities.

Martin closed by drawing attention to the IBM/IDUG "Do You DB2?" contest. Write about your experience in using DB2, and you could win a big flat-screen TV. If you're based in North America, check it out (this initial contest does have that geographic restriction).

More nuggets from IDUG in Tampa to come in other posts. Gotta go now. I'm BUSY.

4 Comments:

Anonymous said...

Good article.

October 29, 2010 at 6:37 AM  
Robert Catterall said...

Thanks!

Robert

October 31, 2010 at 5:53 PM  
Clark Adams said...

I've focused on the data management software part of this post. I've been looking into that topic and looking for its uses. Generally, it can handle an entire enterprise's data. I've read some articles about it being integrated with accreditation systems, customer relationship management systems, online libraries, and the like. You've mentioned that as time goes by, the amount of data that an enterprise handles increases. I wonder how software developers will be able to handle this. Thanks for sharing this, by the way! I've learned a lot from it.

February 16, 2011 at 2:43 AM  
Robert Catterall said...

Hey, Clark.

You ask how software developers will be able to handle the increasingly large amounts of data stored in an organization's database. If that database is managed by DB2 (on whatever platform), my expectation is that developers won't have to be too concerned about data volumes. They'll continue to ask for the data they need (or the data changes they want their programs to effect) via SQL statements (and maybe XPath expressions if XML data is stored in the database), and DB2 will take care of the data volume issues (DB2 scales very well, both vertically and horizontally). Now, if a developer wants to use some DB2 SQL features that can improve throughput and CPU efficiency when data volumes are large (e.g., multi-row FETCH and multi-row INSERT statements, or the FETCH FIRST n ROWS ONLY clause of a query), he or she can do that, but dealing with really large data volumes is primarily a concern of DBAs, who will make physical database design and database operations decisions based on data volumes, among other things. I suppose that as data volumes grow, people writing SQL statements will have a greater incentive to avoid tablespace scans (with matching index access being preferred when tables are very large). That can be done by way of EXPLAIN (and visual EXPLAIN tools), which provides access path information for queries that will be executed by DB2.
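
For example (a sketch against a hypothetical table):

    -- Return only the top 10 rows; DB2 can factor the row limit into
    -- its access path selection
    SELECT ACCT_ID, BALANCE
      FROM ACCOUNTS
     ORDER BY BALANCE DESC
     FETCH FIRST 10 ROWS ONLY;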

February 17, 2011 at 10:02 PM  
