Performance concept - Caching

Caching improves performance by keeping frequently used data in high-speed memory, which reduces access times. Some data sources are much slower than others. The table below shows the approximate time needed to retrieve data from various sources.

Component            Time to fetch 512 bytes of data (ns)
CPU cache            16
Main memory          80
Hard disk            800 + 12,000,000 seek time
Flash SSD disk       3,000
Network interface    50,000
8-speed DVD          300,000 + seek time and velocity changes

Especially in situations where retrieving data takes relatively long (for instance reading from hard disk, CD-ROM and the network), caching can improve performance significantly.

In the case of hard disks, data must be located on the disk before it can be read. Disks are mechanical devices: the read head must first be positioned above the correct track of the disk platter, and then the system must wait for the desired data to spin under the read head. This so-called seek time is long: about 12 ms. Once the data is located, streaming it is much faster: reading 512 bytes of data (a typical disk block) takes only about 0.8 µs (800 ns).
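To get a feeling for these numbers, the following minimal sketch estimates the average time to read one disk block when part of the reads is served from cache. It uses the 12 ms seek time and the 800 ns and 80 ns figures from the table. The hit ratios, the function name `average_read_ns` and the assumption that a cache hit costs about as much as a main memory access are illustrative choices, not measurements.

```python
# Figures taken from the article; the hit ratios below are illustrative assumptions.
SEEK_NS = 12_000_000   # about 12 ms seek time
TRANSFER_NS = 800      # streaming one 512-byte block from disk
MEMORY_NS = 80         # fetching 512 bytes from main memory (assumed cost of a cache hit)


def average_read_ns(hit_ratio: float) -> float:
    """Average time to read one block when a fraction hit_ratio comes from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    disk_ns = SEEK_NS + TRANSFER_NS
    return hit_ratio * MEMORY_NS + (1.0 - hit_ratio) * disk_ns


if __name__ == "__main__":
    for ratio in (0.0, 0.5, 0.9, 0.99):
        millis = average_read_ns(ratio) / 1_000_000
        print(f"hit ratio {ratio:4.0%}: {millis:.3f} ms per block")
```

Because the seek time dominates, the average read time in this model drops almost in proportion to the fraction of reads that miss the cache.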

To speed up reading data from disk, all disk drives contain cache memory. This cache stores recently read data and some of the disk blocks that follow the blocks that were read. When the same data is read again or (more likely) the next disk block is needed, it is fetched from the high-speed cache memory, without the seek time overhead.
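The idea of keeping recently read blocks and pre-fetching the blocks that follow them can be sketched as below. This is a simplified model, not how any particular drive firmware works: the class name `ReadAheadCache`, the `read_block` callback, the least-recently-used eviction and the default sizes are all assumptions made for illustration.

```python
from collections import OrderedDict


class ReadAheadCache:
    """Block cache that keeps recent blocks and prefetches the blocks after a miss."""

    def __init__(self, read_block, capacity=64, read_ahead=4):
        self._read_block = read_block   # hypothetical slow read: block number -> bytes
        self._capacity = capacity       # maximum number of cached blocks
        self._read_ahead = read_ahead   # number of following blocks to prefetch
        self._blocks = OrderedDict()    # block number -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def _store(self, number, data):
        self._blocks[number] = data
        self._blocks.move_to_end(number)
        while len(self._blocks) > self._capacity:
            self._blocks.popitem(last=False)   # evict the least recently used block

    def read(self, number):
        if number in self._blocks:
            self.hits += 1
            self._blocks.move_to_end(number)
            return self._blocks[number]
        self.misses += 1
        # The head is positioned anyway, so also read the blocks that follow.
        data = self._read_block(number)
        for following in range(number + 1, number + 1 + self._read_ahead):
            if following not in self._blocks:
                self._store(following, self._read_block(following))
        self._store(number, data)
        return data


if __name__ == "__main__":
    def fake_disk(number):
        return f"block-{number}".encode()

    cache = ReadAheadCache(fake_disk, capacity=8, read_ahead=3)
    for n in range(8):
        cache.read(n)
    print(cache.hits, cache.misses)
```

With sequential reads, only the first block of each group of four causes a miss in this model. The other reads are served from the cache.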

The same principle applies to DVD drives (and CD-ROM drives, Blu-ray drives, etc.). Here seek time includes not only the steps described above, but also the adjustment of the disk speed. When the read head of a CD-ROM drive (the laser reading the disk) moves from the beginning to the end of the CD-ROM (or the other way around), the rotation speed of the disk changes accordingly. When data at the inner tracks of the disk is read, the disk spins at a higher speed than when data is read at the edge of the disk. The drive's motor must adjust the speed, and this takes a considerable amount of time.

Although network connections are much faster than optical drives, cache memory is used there as well. All CPUs today use internal caching too.

Caching can be implemented in several ways, for instance using disk caching, web proxies, Operational Data Stores, web front-end servers and even in-memory databases.

The best-known example of using caching to increase performance is disk caching. Disk caching can be implemented in the storage component itself (for instance cache on the physical disks or cache implemented in the disk controller), but also in the operating system. A general rule of thumb is that adding memory to servers usually improves performance, because operating systems use otherwise unused memory as disk cache. Over time all memory is filled with the results of previous disk requests and pre-fetched disk blocks, which speeds up data access.

Another example of caching is the use of web proxies. When users browse the Internet, data that was accessed earlier can be cached in a proxy server and fetched from there, instead of fetching all requested data from the Internet again. This has two benefits: users get their data faster than when it is retrieved from a distant web server, and all other users have more Internet bandwidth available, as the data does not have to be fetched again.
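A caching proxy can be sketched in a few lines. The version below is deliberately minimal: the class name `CachingProxy`, the `fetch` callback and the fixed time-to-live are assumptions for illustration. Real proxies decide what to cache, and for how long, based on HTTP caching headers, which this sketch ignores.

```python
import time


class CachingProxy:
    """Serves repeated requests from memory instead of asking the origin server again."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch        # hypothetical function: url -> response body
        self._ttl = ttl_seconds    # assumed fixed lifetime of a cached response
        self._clock = clock
        self._cache = {}           # url -> (expiry time, body)
        self.origin_requests = 0   # how often the origin server was contacted

    def get(self, url):
        now = self._clock()
        entry = self._cache.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]        # cache hit: no Internet bandwidth used
        self.origin_requests += 1
        body = self._fetch(url)
        self._cache[url] = (now + self._ttl, body)
        return body


if __name__ == "__main__":
    proxy = CachingProxy(lambda url: f"<html>content of {url}</html>")
    for _ in range(3):
        proxy.get("http://example.com/")
    print(proxy.origin_requests)
```

Three requests for the same page lead to a single request to the origin server. The other two are answered by the proxy.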

An Operational Data Store (ODS) is a replica of a part of a database for a specific use. Instead of accessing the main database, frequently used information is retrieved from a separate, small ODS database, so that the performance of the main database is not degraded. A good example is the website of a bank. Most users want to see their current balance when they log in (and maybe the last 10 transactions on their account). When every balance change is stored not only in the main database of the bank, but also in a small ODS database, the website only needs to access the ODS to provide users with the data they most likely need. This not only speeds up the user experience, but also decreases the load on the main database.
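The bank example could be modeled roughly as follows. This is only a sketch of the principle: the class names, the `post_transaction` function, the account identifier and the use of integer amounts in cents are hypothetical, and a real ODS would be a separate database that is kept in sync by replication or messaging rather than by a direct function call.

```python
from collections import deque


class MainDatabase:
    """Stand-in for the bank's main database: keeps the complete history."""

    def __init__(self):
        self.transactions = {}   # account -> list of all amounts (in cents)

    def record(self, account, amount):
        self.transactions.setdefault(account, []).append(amount)


class OperationalDataStore:
    """Small replica holding only what the website usually needs."""

    def __init__(self, history=10):
        self._history = history
        self._balances = {}      # account -> current balance
        self._recent = {}        # account -> last `history` amounts

    def apply(self, account, amount):
        self._balances[account] = self._balances.get(account, 0) + amount
        recent = self._recent.setdefault(account, deque(maxlen=self._history))
        recent.append(amount)

    def summary(self, account):
        """Return (balance, recent amounts) without touching the main database."""
        return self._balances.get(account, 0), list(self._recent.get(account, ()))


def post_transaction(main_db, ods, account, amount):
    # Every balance change is stored in the main database and in the ODS.
    main_db.record(account, amount)
    ods.apply(account, amount)


if __name__ == "__main__":
    main_db, ods = MainDatabase(), OperationalDataStore()
    for amount in range(1, 13):
        post_transaction(main_db, ods, "account-1", amount)
    print(ods.summary("account-1"))
```

The website calls only `summary`, so the main database is not queried when a user logs in.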

In web-facing environments, storing the most used pages or parts of pages (like the JPG pictures used on the landing page) on the web front-end server greatly lowers the amount of traffic to back-end systems. Reverse proxies can be used to cache the most requested data as well.

In special circumstances even complete databases can be run from memory instead of from disk. These so-called in-memory databases are used in situations where performance is crucial (like in real-time SCADA systems). Of course, special arrangements must be made to ensure that data is not lost when a power failure occurs.
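As a small illustration, the sketch below uses Python's built-in `sqlite3` module, which can run a database entirely in memory, and copies that database to disk with `Connection.backup`. SQLite is used here only because it is readily available. The `snapshot` helper, the table and the sensor name are made up for the example, and periodic snapshots are just one possible arrangement against data loss.

```python
import os
import sqlite3
import tempfile


def snapshot(memory_db: sqlite3.Connection, path: str) -> None:
    """Copy the complete in-memory database to a file on disk."""
    disk_db = sqlite3.connect(path)
    try:
        memory_db.backup(disk_db)
    finally:
        disk_db.close()


if __name__ == "__main__":
    # ":memory:" tells SQLite to keep the whole database in RAM.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
    db.execute("INSERT INTO readings VALUES (?, ?)", ("pump-1", 3.7))
    db.commit()

    path = os.path.join(tempfile.mkdtemp(), "snapshot.db")
    snapshot(db, path)

    # After a restart, the data can be read back from the snapshot file.
    restored = sqlite3.connect(path)
    print(restored.execute("SELECT sensor, value FROM readings").fetchall())
    restored.close()
    db.close()
```

Note that a snapshot only protects the data up to the moment it was taken. Changes made after the last snapshot are still lost in a power failure.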


This entry was posted on Tuesday 19 April 2011

Disclaimer

The postings on this site are my opinions and do not necessarily represent CGI’s strategies, views or opinions.

 

Copyright Sjaak Laan