Online search, social media, chats... how not to get overwhelmed?
In the digitalized world, companies collect data from a variety of sources: transactional systems, sensors, smart and IoT devices, social media, text files, videos, web searches, chatbots, messaging applications, emails... Volumes, types and sources continue to boom: IDC estimates that 64.2 zettabytes of data were created and replicated worldwide in 2020, a figure expected to reach 180 zettabytes in 2025 (a 23% compound annual growth rate).
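As a quick sanity check on those figures, the short sketch below compounds the 2020 volume at the quoted growth rate for five years. The function name `project_volume` is ours, chosen purely for illustration; the inputs are the IDC numbers cited above.

```python
def project_volume(start_zb: float, annual_growth: float, years: int) -> float:
    """Compound a starting data volume (in zettabytes) at a fixed annual growth rate."""
    return start_zb * (1 + annual_growth) ** years


# IDC figures quoted above: 64.2 ZB in 2020, 23% compound annual growth, 2020 to 2025.
volume_2025 = project_volume(64.2, 0.23, 5)
print(f"Projected 2025 volume: {volume_2025:.0f} ZB")  # prints "Projected 2025 volume: 181 ZB"
```

The result, roughly 181 zettabytes, is in line with the 180 zettabytes estimate.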
Competition in the globalized market requires our organization to be able to handle lots of data, often in real time. Data management responds to this need by helping to gather data from different sources, organize it, prepare it for analysis and make it accessible – in other words, add value to our business.
Data is, in fact, one of the biggest business assets – perhaps the biggest in the digital economy. But that value has to be extracted: having a lot of data does not, per se, guarantee an economic return. Data management techniques – where data goes through preparation, blending and governance – allow you to lay the foundation for effective business analytics and, therefore, for creating economic value.
From an ocean of data to the Data Lake
The Data Lake concept emerged as data became increasingly heterogeneous in source and format; it responds to the need for a single, versatile “container” of business data. Data lakes allow you to import any amount of data in real time: data is collected from multiple sources and moved to the lake in its original format. Companies save the time otherwise spent defining data structures, schemas and transformations, and data scientists or data analysts can access data in its natural form, with greater freedom of movement.
The data lake is a form of data management just like the data warehouse. In the “warehouse”, however, data is first structured and the rules for the analysis are defined in advance; data is prepared in such a way as to fulfil a purpose. In the Data Lake, on the other hand, data is analyzed afterwards, that is, when it is extracted from the “lake”. The purpose remains an open door.
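The difference can be sketched in a few lines of Python. This is a toy illustration, not a real warehouse or lake: the record fields, `WAREHOUSE_SCHEMA` and the helper functions are all hypothetical. The warehouse path shapes records before storing them, while the lake path stores raw records untouched and applies structure only when a question is asked.

```python
import json

# Hypothetical raw events arriving from different sources, in different shapes.
raw_events = [
    '{"source": "web", "user": "u1", "amount": "19.90"}',
    '{"source": "iot", "device": "d7", "temperature": 21.5}',
    '{"source": "web", "user": "u2", "amount": "5.00", "coupon": "SPRING"}',
]

# Warehouse style: the structure is fixed in advance (hypothetical schema).
WAREHOUSE_SCHEMA = ("user", "amount")


def load_into_warehouse(events):
    """Keep only records that fit the predefined schema, typed at load time."""
    rows = []
    for line in events:
        record = json.loads(line)
        if all(field in record for field in WAREHOUSE_SCHEMA):
            rows.append({"user": record["user"], "amount": float(record["amount"])})
    return rows


def load_into_lake(events):
    """Lake style: store every record in its original form."""
    return list(events)


def query_lake(lake, predicate, project):
    """Apply structure only at read time, for the question being asked."""
    results = []
    for line in lake:
        record = json.loads(line)
        if predicate(record):
            results.append(project(record))
    return results


warehouse = load_into_warehouse(raw_events)
lake = load_into_lake(raw_events)

# A question nobody planned for: which devices report more than 20 degrees?
hot_devices = query_lake(
    lake,
    predicate=lambda r: r.get("temperature", 0) > 20,
    project=lambda r: r["device"],
)
print(warehouse)
print(hot_devices)
```

In this sketch the warehouse never sees the IoT reading, because it does not fit the schema defined up front. The lake keeps it, so the unplanned question can still be answered later.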
Speaking of Big Data…
Are you really collecting a lot of data, or at least the data you need? IDC states that only a small fraction of the new data created every day is retained. To be exact, only 2% of the data produced and consumed in 2020 was saved and retained into 2021. Storage capacity last year reached 6.7 zettabytes and will grow by 19.2% a year over 2020-2025, although the amount of data grows faster. The question is, how much data do we need?
Not all of the data generated, of course. But we are probably missing some valuable information. As IDC experts put it: “Organizations should consider preparing now to store more data if they want to move along the path of digital transformation and improve business performance by accelerating initiatives based on innovative methods of data analysis.”
According to IDC, storing and valuing data will help companies in three key areas: digital resilience (the ability to use digital tools to adapt quickly, even to sudden changes); the ability to create innovative products and solutions; and trust-based relationships and empathy with employees, customers, partners and consumers. Data is the “source” of all these benefits.
A little test: do I need the Data Lake?
If the information you collect is increasingly made of unstructured data, the container where you store it should also be unstructured, like the Data Lake. So if your business collects large amounts of data coming from IoT devices, the Internet, social media, and mobile apps, the Data Lake is for you.
If you want to get to know your customers better and define sales strategies based on the collected data (profile, purchase history, interaction with the call center, interactions on social media, etc.) a Data Lake is a valuable tool.
If you are aiming to improve your user experience, making it more personalized and engaging, even in real time, the Data Lake is the data pool you need to perform Analytics and receive the Insights you need.
Do you want to enhance your innovation capacity in products and business models? A data lake, by its very nature, gives data scientists and analysts more freedom of movement. By collecting information from a potentially infinite number of sources, data lakes expand the amount of data analysts can use and are open to different analysis modes.
Is building models to predict probable outcomes and implementing predictive analysis your goal? Data lakes allow you to store relational data, such as operational databases and data from line of business applications, and non-relational data, such as mobile apps, IoT devices, and social media. You can then create historical data reports and apply machine learning techniques to help you achieve your goal.
Do you want to keep up with continuous market and technology innovations and increase your competitiveness? Another significant feature of the Data Lake is that data extraction and analysis are fast. Its agility and speed make the Data Lake a strategic and differentiating element.
The path to Analytics and Actionable Insights
DuneD helps you along the way by extracting value from data. Asset preparation and data management are the foundation that leads to Analytics, with the support of Artificial Intelligence and Machine Learning, and, finally, to solid knowledge that informs your actions (Actionable Insights).
DuneD creates Data Lakes that collect and enrich data from all corporate sources, either structured or unstructured. Data is constantly available and accessible for queries by all business units.
We build our Data products – applications or tools that use data to help companies improve their decisions and processes – through a technique called data blending. With Data Blending, DuneD combines data from different sources, in different formats, into data sets that are blended with each other and displayed in dashboards to give you the “big picture” you need to make a decision. Data blending allows you to aggregate large amounts of data quickly and provides you with a tool for quality Business Intelligence, helping you to easily and immediately highlight relationships between data sets.
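To make the idea concrete, here is a minimal sketch of blending two sources in different formats on a shared key. It is an illustration only, not DuneD's actual implementation: the CRM export, the order events, the field names and the `blend` function are all hypothetical.

```python
import csv
import io
import json

# Hypothetical source 1: a CSV export from a CRM.
crm_csv = """customer_id,name,segment
c1,Alice,premium
c2,Bob,standard
"""

# Hypothetical source 2: JSON order events from a web shop.
orders_json = """[
  {"customer_id": "c1", "total": 120.0},
  {"customer_id": "c1", "total": 80.0},
  {"customer_id": "c2", "total": 35.5}
]"""


def blend(crm_text, orders_text):
    """Join both sources on customer_id and total the orders per customer."""
    customers = {
        row["customer_id"]: row for row in csv.DictReader(io.StringIO(crm_text))
    }
    spend = {}
    for order in json.loads(orders_text):
        key = order["customer_id"]
        spend[key] = spend.get(key, 0.0) + order["total"]
    return [
        {"name": c["name"], "segment": c["segment"], "total_spend": spend.get(cid, 0.0)}
        for cid, c in customers.items()
    ]


blended = blend(crm_csv, orders_json)
for row in blended:
    print(row)
```

Each blended row carries attributes from both sources (customer segment from the CRM, total spend from the orders), which is the kind of combined view a dashboard would then display.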
The process used by the artificial intelligence algorithms we apply to analyze data within the data lake and to generate predictions is always made transparent for you. And we give our products the best possible user interface. Accessing intelligence extracted from data is quick and easy.
Food for thought…
- Big data, going from 3Vs to 5Vs. In 2001, Doug Laney (now Data & Analytics Innovation Fellow at West Monroe) described Big Data with 3Vs: Volume, Velocity and Variety. This was a simple model to define the new data generated by the growing number of information sources. Today, Laney’s paradigm has been enriched by the variables of Veracity and Value, which is why we speak of 5Vs. Laney is also the author of the book “Infonomics”, in which he describes how to manage and monetize information, making it a real business asset to compete in the market.
- Data lake, how to make it your company’s time machine. In 2010, James Dixon, then CTO of Pentaho (now part of Hitachi), coined the term Data Lake. The idea attracted the attention of the media (here is a Forbes article from 2015), also thanks to a challenge launched by Dixon on his blog: building an Enterprise Time Machine from the Data Lake. He challenged his readers to find a way to use data to go back to each stage of an application and then “press” forward to analyze it click after click.
- A collection of TED Talks on Big Data.
- The Alphabet Soup of Data Architectures: podcast of The Data Story series held by two veterans of the Data & Analytics industry, James Serra and Khalil Sheikh. In this “Data Architectures Soup”, James Serra reviews the themes of data lake, data mesh and AI for business intelligence.
- Still not convinced of the business value of data? The podcasts of the Data Skeptic channel are a good start to explore the world of data management.
<a href="https://it.freepik.com/foto/tecnologia">Technology photo created by rawpixel.com – it.freepik.com</a>