New Technologies in a Big Data World


The “big data” world is pervasive, to the point that every organization is now a big data organization. As AI, machine learning, and other high-end analytics become mainstream parts of business operations, new ways of handling data assets are coming to the fore. Industry leaders and experts recently shared their views with Big Data Quarterly on the latest technology developments shaping today’s big data world.

Artificial Intelligence

Of course, AI is the biggest story this year, and industry observers have no shortage of predictions about how AI is reshaping the management and mission of data centers. It’s a two-way street as well; effective data management may be essential to AI, but AI will also pave the way to effective data management. “Artificial intelligence is changing the way data is managed,” said Jeff Foster, director of technology and innovation for Redgate. AI is catalyzing new methods, “from requiring new shapes of data lookup in the form of vector databases, to surfacing the data for training through to governing how data is labeled, classified, and used.”

AI-powered automation of data tasks includes “data cleansing, classification, and categorization,” said Alex Kelleher, chief data officer, president of data and intelligence, at Advantage Intelligence. “Machine learning algorithms can learn from existing data and automatically identify patterns and anomalies, making it easier to manage and process data at scale. This can also help improve data quality and accuracy—machines can identify and correct errors more efficiently than humans.”
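
As a concrete illustration of the pattern-and-anomaly detection Kelleher describes, the following sketch flags suspect rows in a numeric dataset with an isolation forest from scikit-learn. It is a minimal example on synthetic data, not a description of any particular vendor’s pipeline.

    # Minimal sketch: flag anomalous records so they can be reviewed or
    # corrected before downstream use. The data here is synthetic.
    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(0)
    normal = rng.normal(loc=100.0, scale=5.0, size=(500, 2))   # typical records
    errors = np.array([[100.0, 900.0], [-50.0, 102.0]])        # obvious data-entry mistakes
    records = np.vstack([normal, errors])

    model = IsolationForest(contamination=0.01, random_state=0)
    labels = model.fit_predict(records)      # -1 marks suspected anomalies

    suspects = np.where(labels == -1)[0]
    print("rows flagged for review:", suspects)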

AI promises to automate many data management processes—as well as solve the age-old challenge of bringing various data types and sources into a single view of business problems. “All varieties of data at lightning speed, when combined with AI, open new possibilities for enterprises,” said Sunil Senan, SVP and business head of data and analytics for Infosys.

Progress: Industry observers have mixed opinions on the degree of progress made with AI so far. Kelleher, for his part, sees impressive gains being made: AI-driven data management “is already being used across many industries, and these approaches are now fairly mature,” he said. Others, however, see AI as still in its early stages, “further in thought than it is in implementation, to be honest,” Foster related. Even among implementations, “63% of AI models function only at basic capability, are driven by humans, and often fall short on data verification, data practices, and data strategies,” said Senan.

Challenges: Potential issues in AI-driven data management include “concerns around data privacy and security, ethical considerations related to the use of AI, and the potential for job displacement due to increased automation,” said Kelleher. The “accuracy and reliability of AI algorithms, particularly in cases where they are trained on biased data,” also need to be addressed. A shortage of AI development and implementation skills may also hamstring rollouts, he added.

Further, infrastructure may not be ready for AI. “Current LLMs require high-performance GPUs with an associated cost premium,” Foster cautioned. “Over time, this will decrease, but it’s specialized hardware at the moment.”

Business benefits: Ultimately, AI may be transformative across many organizations. “The use of AI-powered data automation technology can deliver many benefits, including improved operational efficiency, increased accuracy and consistency in data processing, and the ability to extract insights and make decisions more quickly,” said Kelleher. “It can also enable organizations to automate certain tasks and processes, freeing up our human resources to focus on higher-level tasks that require more creativity and critical thinking.”

Generative AI itself will “solve content-generation, promising radical change across the industry, and it will be interesting to see how far and deep that change goes,” said Foster.

Distributed SQL Databases

Databases are continually evolving, and the latest generation of relational databases is the distributed SQL database: a single logical relational database that replicates data across multiple servers and environments. This provides organizations with “the ability to enjoy a modern, end-to-end data solution from app to database to infrastructure,” said Karthik Ranganathan, founder and CTO at Yugabyte. “Organizations are spending billions to modernize applications and infrastructure—fueled by the need for agile development, global scale, high availability, and reduced costs. But what about core database? Until recently, not much had changed. Distributed SQL databases combine enterprise-grade RDBMS capabilities and familiar PostgreSQL interfaces with the horizontal scalability and resilience of cloud native architectures.”
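
Because distributed SQL databases such as YugabyteDB expose a PostgreSQL-compatible interface, existing drivers and SQL largely carry over. The sketch below, in Python with psycopg2, assumes a locally running YugabyteDB cluster with its default YSQL port and credentials; the orders table is hypothetical and shown only to illustrate that ordinary SQL is used while the database shards and replicates data across nodes.

    # Minimal sketch: connect to a PostgreSQL-compatible distributed SQL
    # database with an ordinary PostgreSQL driver. Host, port, credentials,
    # and the orders table are illustrative assumptions.
    import psycopg2

    conn = psycopg2.connect(
        host="127.0.0.1",   # any node in the cluster can accept connections
        port=5433,          # YugabyteDB's default YSQL port (plain PostgreSQL uses 5432)
        dbname="yugabyte",
        user="yugabyte",
    )

    with conn, conn.cursor() as cur:
        # Ordinary SQL; the database handles sharding and replication.
        cur.execute("""
            CREATE TABLE IF NOT EXISTS orders (
                id UUID PRIMARY KEY,
                customer TEXT,
                total NUMERIC
            )
        """)
        cur.execute("SELECT count(*) FROM orders")
        print(cur.fetchone()[0])

    conn.close()

Apart from the connection details, the same code runs unchanged against a single-node PostgreSQL instance, which is the point of the familiar-interface argument.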

The advantage of distributed SQL databases is that they “update the core data layer powering organizations’ most important applications,” Ranganathan explained. “Distributed SQL helps to simplify applications and allows developers to focus on core, valuable features. Native distribution of data provides easy growth and scale while delivering high availability as node, zone, and region failures are seamlessly handled. Global applications can be supported without sacrificing customer experience and app performance.”

Progress: Distributed SQL databases are still relatively new on the scene, “but are one of the fastest-growing database technologies today, and are ushering in the first major database architecture change in decades,” said Ranganathan. “Product catalogs, financial transactions, and user identity management all depend on relational databases. However, legacy solutions don’t provide the scale, availability, and low cost that modern companies need.”

Challenges: The emerging market for distributed SQL is crowded, Ranganathan cautioned. “Organizations are overwhelmed by choice and may simply choose the easy path of staying with their familiar, legacy solutions.” In addition, he added, “the widespread adoption of any new technology requires not only core capabilities but also a larger investment around the processes and people involved.”

Business benefits: Distributed SQL introduces a major evolution in how databases are architected, “allowing it to uniquely deliver benefits that were not previously achievable in a single database,” said Ranganathan. “The core benefits include simplifying applications, providing native resiliency and scalability, and enabling global expansion.”

Digital Twins

Digital twins—or online replicas of facilities, systems, or organizations—are being built and used on top of the data now flowing in from every corner of the enterprise. “Digital twins are dramatically impacting how data is managed and delivered,” said Mike Campbell, chief product officer at Bentley Systems. “As a digital replica of any asset, system, or process, this technology orchestrates and integrates data, allowing easy access for multiple parties/departments to review and analyze information in order to gain insights and make more informed decisions.”

Digital twins combine historical and real-time data “to create predictive models that provide an outlook on future performance,” Campbell added. “This can improve efficiency, optimize operations, provide ongoing maintenance recommendations, and more. Digital twins can be applied everywhere, from manufacturing and construction to supply chain and logistics management. Examples of digital twins currently in use include buildings, roads and bridges, electric grids, water networks, and even entire campuses and cities.”
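
Campbell’s description of combining historical and real-time data can be made concrete with a small sketch: fit a model on past sensor readings from an asset, then score a live reading to estimate when maintenance is due. The asset, readings, and threshold below are hypothetical and chosen only to illustrate the predictive side of a digital twin.

    # Minimal sketch of the predictive side of a digital twin: train on
    # historical sensor readings, then score a "real-time" reading from the
    # physical asset. All values are invented for illustration.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Historical data: [vibration (mm/s), temperature (C)] -> days until service
    history = np.array([
        [1.2, 60.0],
        [2.5, 68.0],
        [3.1, 72.0],
        [4.0, 80.0],
        [5.2, 88.0],
    ])
    days_until_service = np.array([180, 120, 90, 45, 10])

    model = LinearRegression().fit(history, days_until_service)

    # A live reading streamed from the physical asset
    live_reading = np.array([[3.8, 79.0]])
    predicted_days = model.predict(live_reading)[0]

    if predicted_days < 30:
        print(f"Schedule maintenance soon (~{predicted_days:.0f} days of margin)")
    else:
        print(f"Asset healthy (~{predicted_days:.0f} days until service)")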

Progress: Industries including “construction, engineering, and manufacturing are already realizing the benefits of digital twin technology, but there is still so much untapped potential,” said Campbell. “As the technology continues to evolve and become more accessible, we will see more widespread adoption and use in the coming years, particularly with critical infrastructure projects.”

Challenges: As with any new technology-intensive initiative, it’s important to work closely with the business and plan a digital twin implementation. Digital twin technology touches many parts of the enterprise, and thus requires special attention to detail. “Even after projects are underway, digital twin technology can be implemented, leveraging everything available to create predictive models,” said Campbell.

Business benefits: Digital twin technology “streamlines processes, fosters collaboration, and optimizes operations across the board,” said Campbell. “Digital twins enable efficiencies across all lifecycle stages—from design and construction through operations and maintenance.”

This, in turn, provides insights “that allow for new revenue streams, automation of select processes, enhanced skill sets, an increased return on investment, quality of deliverables, and much more. Additionally, digital twin technology can provide real-time feedback on an asset’s performance, allowing for continuous improvement and optimization throughout the entire lifecycle. This is really just the beginning for digital twin technology.”
