Towards a task-based search and recommender systems

TOLOMEI, GABRIELE;ORLANDO, Salvatore;
2010-01-01

Abstract

Traditional data warehouses mostly run on large, expensive server and storage systems. For small and medium-sized companies in particular, running or renting such systems is often too expensive. These companies might need analytical services only from time to time, for example at the end of a billing period. Cloud computing offers a way to overcome these problems. In this paper, we report on work in progress towards building an OLAP cluster of multi-tenant main-memory column databases on the Amazon EC2 cloud computing environment, for which purpose we ported SAP's in-memory column database TREX to run in the Amazon cloud. We discuss early findings on cost/performance tradeoffs between reliably storing the data of a tenant on a single node using highly available network-attached storage, such as Amazon EBS, versus replicating tenant data to a secondary node where the data resides on less resilient storage. We also describe a mechanism to support historical queries across older snapshots of tenant data, which are lazy-loaded from Amazon's S3 near-line archiving storage and cached on the local VM disks. © 2010 IEEE.
2010
Workshops Proceedings of the 26th International Conference on Data Engineering, ICDE 2010, March 1-6, 2010, Long Beach, California, USA
Files in this item:
File: ICDEworkshop2010.pdf (not available)
Type: Abstract
License: Closed access (personal)
Size: 295.95 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/35732