What is Big Data?

According to IBM, 2.5 quintillion bytes of data are created every day, and 90% of the world’s data was created in the past two years. Because of the popularity of sensors, mobile telephony, surveillance cameras, RFID tags, social networks, digital photographs, video, and so on, the amount of generated data is growing every day.

The world's effective annual capacity to exchange information through telecommunication networks is shown in the figure below, where one exabyte is one billion gigabytes.

Figure: Big Data
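
To make these units concrete, here is a small calculation. It is only a sketch: it assumes decimal (SI) units, and the names `GIGABYTE`, `EXABYTE`, and `bytes_to_exabytes` are made up for this illustration.

```python
# Assumption: decimal (SI) units, so 1 GB = 10**9 bytes and 1 EB = 10**18 bytes.
GIGABYTE = 10**9
EXABYTE = 10**18


def bytes_to_exabytes(n_bytes: int) -> float:
    """Convert a byte count to exabytes (hypothetical helper)."""
    return n_bytes / EXABYTE


# One exabyte expressed in gigabytes: one billion.
gigabytes_per_exabyte = EXABYTE // GIGABYTE

# IBM's figure of 2.5 quintillion bytes per day, written as an exact integer.
daily_bytes = 25 * 10**17
daily_exabytes = bytes_to_exabytes(daily_bytes)

print(gigabytes_per_exabyte)  # 1000000000
print(daily_exabytes)         # 2.5
```

In these units, 2.5 quintillion bytes per day is 2.5 exabytes per day, or 2.5 billion gigabytes.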

Most of this data is unstructured, meaning that it is stored not in databases but in emails, text documents, spreadsheets, and the like. Making efficient use of this data is quite a challenge. In some cases, the amount of data coming into an organization is too large even to store.

Big data is about searching, processing, and storing data that is increasing in volume, velocity (the speed at which data moves through a system), and variety (the types of data), also known as the three Vs.

One example is the LOFAR telescope, which generates an enormous amount of data each second, too much to store on disk. The only way to process the data is while it is still in transit. The live data stream is processed, and only the result of this processing, which is much smaller in size, is actually stored on disk to be analyzed.
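
The idea of processing data in transit and keeping only a small result can be sketched in a few lines. This is not LOFAR's actual pipeline; the names `sample_stream` and `summarize`, and the choice of summary statistics, are assumptions made purely for illustration.

```python
import random
from typing import Dict, Iterable, Iterator


def sample_stream(n: int, seed: int = 0) -> Iterator[float]:
    """Hypothetical stand-in for a live sensor feed: yields n samples one at a time."""
    rng = random.Random(seed)
    for _ in range(n):
        yield rng.gauss(0.0, 1.0)


def summarize(stream: Iterable[float]) -> Dict[str, float]:
    """Reduce a stream to a few numbers without ever holding the raw samples.

    Each sample is read once, folded into the running totals, and then dropped,
    so memory use stays constant no matter how long the stream is.
    """
    count = 0
    total = 0.0
    peak = float("-inf")
    for sample in stream:
        count += 1
        total += sample
        peak = max(peak, sample)
    mean = total / count if count else 0.0
    return {"count": count, "mean": mean, "peak": peak}


# Only this small summary would be written to disk, not the million raw samples.
result = summarize(sample_stream(1_000_000))
print(result["count"])
```

Because `sample_stream` is a generator, the raw samples never exist in memory all at once; only the three summary values survive the processing step.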

New infrastructure solutions are needed to cope with big data, including high-speed networks, fast processing nodes, and specialized storage.


This entry was posted on Monday 31 December 2012

Sjaak Laan



Copyright Sjaak Laan