The State of Hadoop for Big Data Partners and Data Scientists

I just landed at Strata+Hadoop World in New York -- a major gathering of data scientists and information managers within the big data ecosystem. My mission: Determine the state of Hadoop for partners and data scientists.

Large IT consulting firms like Accenture and Deloitte are here. And boutique consultants are certainly navigating the show floor. But I sense the Hadoop wave won't reach SMB channel partners for another two years or so -- perhaps via public clouds rather than on-premises opportunities.

I'll be testing that hunch during a range of meetings today. I'll be sure to update this blog after each meeting, recapping the conversation for channel partners that are trying to wrap their arms around big data. I'll also be gathering enterprise-centric insights for Source Media's Information Management website.

Plus, public cloud providers like Google, Microsoft and Amazon are promoting a range of new Hadoop and big data tools. And on-premises equipment providers like Cisco Systems, EMC, IBM and others are striving to promote converged infrastructure (compute, storage, network) for big data Hadoop applications.

Check back every hour or so for some anecdotes and revelations from each meeting.

Meeting 1: MemSQL CEO Eric Frenkiel (pictured) and Chief Marketing Officer Gary Orenstein

The database-centric company just launched Streamliner, which allows customers to set up real-time data pipelines for real-time analysis. It works with Spark, though MemSQL is focused on a range of technologies beyond that open source option. An example customer: An energy company in Oklahoma is using Streamliner and MemSQL to monitor very expensive drill bits during the fracking process. Drilling adjustments -- based on sensors that monitor the bit's performance, temperature and other real-time analytics -- can be made instantly.

MemSQL Partner Strategy: MemSQL launched in 2011, emerged from stealth mode in 2013, and now has about 70 employees. The company is already working with a range of partners and OEMs, and expects to launch a formalized partner program in 2016 for ISVs, VARs and more. Partner opportunities cut across three areas, the company asserts -- first, migrating Oracle and Microsoft SQL Server customers to MemSQL for improved scalability; second, overhauling existing data warehouses with MemSQL; and third, building new big-data applications that require real-time performance.

"The beauty of our technology is it doesn’t require you to go get a certification; the administration of the database is very simple. Anyone who is familiar with Oracle or SQL Server will understand MemSQL in no time," said Frenkiel.

Partners can try MemSQL's free community version to get started, and a MemSQL Ops management dashboard allows MSPs to manage database clusters.

Meeting 2: Trifacta Co-founder and CTO Sean Kandel, VP of Marketing Joe Scheuermann and Director of Product Marketing Will Davis

The company's focus is simply explained: Before you can effectively analyze data, you need to prepare it, weed out or fix anomalous values (was a phone number field filled with a zip code?), and ultimately enrich the data. Or, you may need to combine and cleanse data from multiple sources.
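
To make the phone-number-versus-zip-code example concrete, here is a minimal Python sketch of that kind of check. It is not Trifacta's product or method; the patterns, function names and sample records are all hypothetical, and the rules assume US-style phone numbers and zip codes.

```python
import re

# Hypothetical patterns for US-style values; real data would need broader rules.
ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")
PHONE_RE = re.compile(r"^\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$")


def classify_phone_field(value):
    """Label a phone field as 'ok', 'looks_like_zip', or 'invalid'."""
    text = value.strip()
    if PHONE_RE.match(text):
        return "ok"
    if ZIP_RE.match(text):
        return "looks_like_zip"
    return "invalid"


def flag_anomalies(records):
    """Return (index, label) pairs for records whose phone field is not 'ok'."""
    flagged = []
    for i, rec in enumerate(records):
        label = classify_phone_field(rec.get("phone", ""))
        if label != "ok":
            flagged.append((i, label))
    return flagged


# Made-up sample records for illustration only.
records = [
    {"name": "A", "phone": "212-555-0147"},
    {"name": "B", "phone": "10001"},
    {"name": "C", "phone": "n/a"},
]
print(flag_anomalies(records))
```

A check like this only flags suspect rows; deciding whether to fix, drop or enrich them is the harder part of data preparation.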

Trifacta has about 40 customers -- many of them are big names (Orange Telecom, GoPro, Pfizer, PepsiCo, P&G and more). And most of the focus, currently, is the Hadoop ecosystem.

Trifacta Partner Strategy: Some MSPs -- such as SoftNet -- offer Trifacta as a managed service. Also, Accenture and regional systems integrators have engaged the company. On the technology front, Trifacta has relationships with the major Hadoop providers plus Amazon AWS and Microsoft Azure.

Watch for the company to push beyond Hadoop in the months ahead, and a formal partner program also is on the way...

Meeting 3: MapR VP of Marketing Jack Norris

MapR is one of the leading providers of Hadoop. The company's latest move involves OJAI (the Open JSON Application Interface) -- which essentially allows developers to write and adjust Hadoop applications far more quickly. Norris showed me sample code written the old way -- and then the alternative code leveraging OJAI.

Basically, OJAI required fewer commands and fewer lines of code to deliver far more application capabilities. Developers can test OJAI now, with general availability expected later this year.

MapR Partner Strategy: Check the company's homepage and you'll see clear links for MapR's emerging partner strategy. Moreover, more than 40,000 people have embraced MapR's online training courses -- with IT consultants increasingly in the mix.

Meeting 4: Hortonworks VP of Product and Alliance Marketing Matt Morgan & Senior Director of Global Product Marketing Wei Wang

Hortonworks is another leading provider of Hadoop. Much of the company's focus at the conference involves the Internet of Things (actually, the Internet of Anything) and sensor networks. Hortonworks' challenge: How to manage edge devices and sensors so (A) they gather the right information and (B) send only the required information over the wire back to the corporate data lake/Hadoop storage system?

The answer involves Hortonworks DataFlow -- a new offering powered by Apache NiFi. "HDF is designed to make it easy to automate and secure all types of data flows and collect, conduct and curate real-time business insights and actions derived from any data, from anything, anywhere," the company asserted.

Hortonworks Partner Strategy: In previous conversations, Morgan has described the growing role of OEMs, ISVs and IT consulting firms in the Hortonworks ecosystem. What's next? I suspect MSPs will step in to manage DataFlow connections and related sensor networks.

Bottom Line

Overall, the Strata+Hadoop World conference was packed with attendees. And most case studies involved Global 2000 companies describing true innovations that drive data insights and potential business breakthroughs.

The major Hadoop providers each have several hundred customers on board at this point. But many startups at the conference have fewer than 50 customers -- meaning that there's still plenty of risk in the market.

Still, the rewards associated with Hadoop seem to be growing by the day. For customers. For distributors. For big consulting firms and integrators. And perhaps as soon as 2016 or so, for mainstream channel partners.


Joe Panettieri

Joe Panettieri is co-founder & editorial director of MSSP Alert and ChannelE2E, the two leading news & analysis sites for managed service providers in the cybersecurity market.