The days of data silos are numbered. Businesses that organize around products or business functions rather than data will disappear.
The development and adoption of service-oriented architecture (SOA) and cloud computing created a steady stream of incentives to abandon data silos. However, data inertia (the difficulty of giving access to data or moving it) and organizational structures have been significant stumbling blocks to exploiting those incentives fully while also organizing businesses around data.
A Turning Point
We are at a turning point with big data. SOA and cloud computing were about efficiencies, which could be ignored. Big data is a strategic shift that will transform businesses so thoroughly that adopters will supersede the stragglers.
SOA removes duplication of business logic and encourages reuse through business process management and orchestration. In 2001, SAP imagined a global service marketplace where companies exchange and orchestrate services, interleaving internal and third-party ones and replacing those that prove uncompetitive or unreliable. This has not come true in the many years SOA has been promoted by the likes of SAP, IBM and Microsoft, and marketplace efforts have largely been abandoned. In reality, large corporations struggle with service discovery, standardization and, as a result, reuse, even internally. Interest in SOA has since waned.
Cloud computing and virtualization have met with more success. They commodify and optimize the utilization of data processing and storage. Private, hybrid and public cloud patterns target different use cases, balancing, for example, privacy, compliance, and base- and burst-load situations. The offerings range from infrastructure (e.g., virtualized hardware) to platform (e.g., database systems) and software services (e.g., email). More than 80 percent of companies now use a cloud service. Adoption varies, though: start-ups often embrace the cloud for their entire infrastructure and services so they can iterate fast and cheaply, in exchange for higher operational costs.
Established businesses with legacy computing infrastructure and compliance requirements, such as banks, have been much slower to adopt the cloud. Hybrid cloud solutions and special secure cloud offerings target these businesses, for example to offload or speed up non-sensitive data processing. Notably, this does not require a unified approach to data management, which would mean abandoning data silos and refocusing the business around data.
Unsurprisingly, many companies struggle to extract value from even their own data, let alone combine it with external data. Jenna Danko, product marketing manager at Oracle Financial Services, stated:
“The [financial] sector is one of the most data-driven industries, and analysts estimate that somewhere between 80 percent and 90 percent of the data that exists within a bank’s data centre is not analyzed: data from call logs, weblogs, emails and documents.”
This highlights the untapped potential these businesses have, and the threat they face if competitors exploit such data first.
The Big Data Shift
Businesses that don’t adopt SOA or cloud computing can usually continue to compete in the marketplace. These technologies are in effect cost-saving measures, which can make a business more competitive but are rarely a deciding factor. Moreover, even data-driven industries have yet to fully exploit their own data and correlate it extensively with relevant external sources to achieve a holistic data view.
Big data changes this. The focus on collecting, sharing and generating insight from large volumes of data from within and beyond a business promises to improve existing products and processes. Importantly, it permits new products and processes, which will set a business apart from its competition.
Big data requires breaking down organizational barriers and data silos, and overcoming technical challenges, to combine all the relevant data, internal and external, for products, analytics and insight. The technical challenges have largely been addressed by numerous tools, most of which have reached a high level of maturity in recent years, e.g., Hadoop, Hive, Pig, Flume, Sqoop, HBase, Oozie and Storm, to name a few prominent open source ones. They provide an ecosystem that can Extract, Transform and Load (ETL) data of any size and from any source easily and inexpensively.
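To make the Extract, Transform and Load idea concrete, here is a minimal sketch in plain Python. It is not how Hadoop or the other tools named above work internally; it only illustrates the three steps at toy scale. The sample data, the column names and the customer_view table are all hypothetical, invented for this illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample data: one "internal" silo and one "external" feed.
INTERNAL_CSV = "customer_id,region,revenue\n1,north,1200\n2,south,800\n3,north,450\n"
EXTERNAL_CSV = "region,market_growth_pct\nnorth,4.5\nsouth,1.2\n"


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(internal_rows, external_rows):
    """Transform: enrich each internal row with the external figure for its region."""
    growth = {r["region"]: float(r["market_growth_pct"]) for r in external_rows}
    return [
        (
            int(r["customer_id"]),
            r["region"],
            float(r["revenue"]),
            growth.get(r["region"]),  # None if the external feed lacks the region
        )
        for r in internal_rows
    ]


def load(rows, conn):
    """Load: write the combined rows into a single queryable table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_view ("
        "customer_id INTEGER PRIMARY KEY, region TEXT, "
        "revenue REAL, market_growth_pct REAL)"
    )
    conn.executemany("INSERT INTO customer_view VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run the three steps against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    rows = transform(extract(INTERNAL_CSV), extract(EXTERNAL_CSV))
    load(rows, conn)
    return conn


if __name__ == "__main__":
    conn = run_etl()
    query = (
        "SELECT region, SUM(revenue), MAX(market_growth_pct) "
        "FROM customer_view GROUP BY region ORDER BY region"
    )
    for row in conn.execute(query):
        print(row)
```

The point of the sketch is the shape of the pipeline: once internal and external data land in one place, a single query can answer questions that neither source could answer on its own.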
See on www.cmswire.com