NoSQL For Dummies

Name: NoSQL For Dummies
Brand: Wiley
Price: 23.99 EUR
Availability: OnlineOnly

Adam Fowler(Author)

Wiley (Publisher)

Published on 20. January 2015

456 pages

E-Book: ePUB with Adobe-DRM

E-Book: ePUB without DRM

978-1-118-90562-3 (ISBN)

from €23.99

Available for download

Description

Get up to speed on the nuances of NoSQL databases and what they mean for your organization
This easy to read guide to NoSQL databases provides the type of no-nonsense overview and analysis that you need to learn, including what NoSQL is and which database is right for you. Featuring specific evaluation criteria for NoSQL databases, along with a look into the pros and cons of the most popular options, NoSQL For Dummies provides the fastest and easiest way to dive into the details of this incredible technology. You'll gain an understanding of how to use NoSQL databases for mission-critical enterprise architectures and projects, and real-world examples reinforce the primary points to create an action-oriented resource for IT pros.
If you're planning a big data project or platform, you probably already know you need to select a NoSQL database to complete your architecture. But with options flooding the market and updates and add-ons coming at a rapid pace, determining what you require now, and in the future, can be a tall task. This is where NoSQL For Dummies comes in!
* Learn the basic tenets of NoSQL databases and why they have come to the forefront as data has outpaced the capabilities of relational databases
* Discover major players among NoSQL databases, including Cassandra, MongoDB, MarkLogic, Neo4j, and others
* Get an in-depth look at the benefits and disadvantages of the wide variety of NoSQL database options
* Explore the needs of your organization as they relate to the capabilities of specific NoSQL databases
Big data and Hadoop get all the attention, but when it comes down to it, NoSQL databases are the engines that power many big data analytics initiatives. With NoSQL For Dummies, you'll go beyond relational databases to ramp up your enterprise's data architecture in no time.

Content

Introduction
Part I: Getting Started with NoSQL
Chapter 1: Introducing NoSQL: The Big Picture
Chapter 2: NoSQL Database Design and Terminology
Chapter 3: Evaluating NoSQL
Part II: Key-Value Stores
Chapter 4: Common Features of Key-Value Stores
Chapter 5: Key-Value Stores in the Enterprise
Chapter 6: Key-Value Use Cases
Chapter 7: Key-Value Store Products
Chapter 8: Riak and Basho
Part III: Bigtable Clones
Chapter 9: Common Features of Bigtables
Chapter 10: Bigtable in the Enterprise
Chapter 11: Bigtable Use Cases
Chapter 12: Bigtable Products
Chapter 13: Cassandra and DataStax
Part IV: Document Databases
Chapter 14: Common Features of Document Databases
Chapter 15: Document Databases in the Enterprise
Chapter 16: Document Database Use Cases
Chapter 17: Document Database Products
Chapter 18: MongoDB
Part V: Graph and Triple Stores
Chapter 19: Common Features of Triple and Graph Stores
Chapter 20: Triple Stores in the Enterprise
Chapter 21: Triple Store Use Cases
Chapter 22: Triple Store Products
Chapter 23: Neo4j and Neo Technologies
Part VI: Search Engines
Chapter 24: Common Features of Search Engines
Chapter 25: Search Engines in the Enterprise
Chapter 26: Search Engine Use Cases
Chapter 27: Types of Search Engines
Chapter 28: Elasticsearch
Part VII: Hybrid NoSQL Databases
Chapter 29: Common Hybrid NoSQL Features
Chapter 30: Hybrid Databases in the Enterprise
Chapter 31: Hybrid NoSQL Database Use Cases
Chapter 32: Hybrid NoSQL Database Products
Chapter 33: MarkLogic
Part VIII: The Part of Tens
Chapter 34: Ten Advantages of NoSQL over RDBMS
Chapter 35: Ten NoSQL Misconceptions
Chapter 36: Ten Reasons Developers Love NoSQL
Index

Chapter 1

Introducing NoSQL: The Big Picture

In This Chapter

* Examining the past
* Recognizing changes
* Applying capabilities

The data landscape has changed. During the past 15 years, the explosion of the World Wide Web, social media, and web forms you have to fill in, together with greater connectivity to the Internet, has put a vaster array of data in use than ever before.

New and often crucial information is generated hourly, from simple tweets about what people have for dinner to critical medical notes by healthcare providers. As a result, systems designers no longer have the luxury of closeting themselves in a room for a couple of years designing systems to handle new data. Instead, they must quickly create systems that store data and make information readily available for search, consolidation, and analysis. All of this means that a particular kind of systems technology is needed.

The good news is that a huge array of these kinds of systems already exists in the form of NoSQL databases. The not-so-good news is that many people don't understand what NoSQL databases do or why and how to use them. Not to worry, though. That's why I wrote this book. In this chapter, I introduce you to NoSQL and help you understand why you need to consider this technology further now.

A Brief History of NoSQL

The perception of the term NoSQL has evolved since it was first used in 1998. So, in this section, I want to explain how NoSQL is currently defined, and then propose a more appropriate definition for it. I also cover the history of NoSQL in the sidebars.

The first NoSQL "meetup"

The first documented use of the term NoSQL was by Carlo Strozzi in 1998. He was visiting San Francisco and wanted to get some people together to talk about his lightweight, relational database.

Relational database management systems (RDBMS) are the dominant type of database today. If you ask computer scientists who have graduated within the past 20 years what a database is, odds are they will describe a relational database.

Carlo used the term NoSQL because his database was accessed via shell scripts, rather than through use of the standard Structured Query Language (SQL). The original meaning was "No SQL." That is, instead of using SQL, it used a query mechanism closer to the developer's source environment - in Carlo's case, the UNIX scripting world.

The use of this term shows a frustration among the developer community with using SQL. Although SQL was an open standard with massive common support in the prevalent relational databases of the time, the term NoSQL shows a desire to find a better way. Or at least, a way better for the poor old developer reading through long and complex SQL queries.

Carlo's meeting in San Francisco came and went. Developers continued to experiment with alternate query mechanisms. Technology appeared to abstract complex queries away from the developer. A prime example is the Hibernate library in Java, which is driven by configuration and enables the automatic generation of value objects that map directly onto database tables. As a result, developers don't have to worry so much about how the underlying database is structured - they just call functions on objects.
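To make the idea of value objects mapped onto tables concrete, here is a minimal sketch in Python. It is not Hibernate and does not reflect Hibernate's API. The Customer class, the Repository class, and its method names are hypothetical, invented only for illustration, and the sketch assumes each field of the object maps to a column of the same name.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Customer:
    # Hypothetical value object; each field maps to a column of the same name.
    id: int
    name: str
    city: str


class Repository:
    """Tiny illustrative mapper (an assumption, not a real library):
    it hides the SQL behind method calls on objects."""

    def __init__(self, conn, cls):
        self.conn = conn
        self.cls = cls
        self.table = cls.__name__.lower()
        self.columns = [f.name for f in fields(cls)]

    def create_table(self):
        cols = ", ".join(self.columns)
        self.conn.execute(f"CREATE TABLE IF NOT EXISTS {self.table} ({cols})")

    def save(self, obj):
        # The SQL is generated from the object's fields, not written by hand.
        placeholders = ", ".join("?" for _ in self.columns)
        values = [getattr(obj, c) for c in self.columns]
        self.conn.execute(
            f"INSERT INTO {self.table} VALUES ({placeholders})", values
        )

    def find_by_id(self, obj_id):
        cols = ", ".join(self.columns)
        row = self.conn.execute(
            f"SELECT {cols} FROM {self.table} WHERE id = ?", (obj_id,)
        ).fetchone()
        return self.cls(*row) if row else None


conn = sqlite3.connect(":memory:")
customers = Repository(conn, Customer)
customers.create_table()
customers.save(Customer(1, "Ada", "London"))
found = customers.find_by_id(1)  # the caller never writes SQL
```

The calling code works only with Customer objects and method calls. The table layout and the queries stay inside the mapper, which is the kind of abstraction described above.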

There's a cost to using SQL. Complex queries are hard to debug, and it's even harder to make them perform well, which increases the cost of development, administration, and testing. Finding an alternative mechanism, or a library to hide the complexities at least, looked like a good way to reduce costs and make it easier to adopt best practices.

Abstraction gets you only so far, though. Eventually, data problems will emerge that require a completely different way of thinking. Existing relational technology didn't work well with such problems, and the explosive growth of the Internet and World Wide Web would give rise to exactly these issues.

Moreover, other key things were happening. In 1991, the first public web page was created, just seven years before the NoSQL "meetup." Yahoo and Amazon were founded in 1994. In comparison, Google, which we tend to think has always existed, wasn't founded until 1998. Yes, there was a web before Google - and before Google, remember AltaVista (which was eventually purchased and shut down by Yahoo!) and Ask Jeeves (now known as Ask.com)?

The specification for the language used for system-to-system communication - XML - was released as a recommendation in 1998. The XSLT specification - used to transform XML between formats - came in 1999. The web was young and wild, and people were still just trying to figure out how to make money with it. It had not yet changed the world.

Amazon and Google papers

NoSQL isn't a single technology invented by a couple of guys in a garage or a mathematician theorizing about data structures. The concepts behind NoSQL developed slowly over several years. Independent groups then took those ideas and applied them to their own data problems, thereby creating the various NoSQL databases that exist today.

Google Bigtable paper

In 2006, Google released a paper that described its Bigtable distributed structured database. Google described Bigtable as follows: "Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers."

Although it looks similar to an RDBMS model at first sight, Bigtable stores each row under a single key and groups the data within a row into related column families. Therefore, accessing all related data is as easy as retrieving a record by its ID, rather than running a complex join as in relational database SQL.

This model also means that distributing data is more straightforward than with relational databases. By using simple keys, related data - such as all pages on the same website (given as an example in Google's paper) - can be grouped together, which increases the speed of analysis. You can think of Bigtable as an alternative to many tables with relationships. That is, with Bigtable, column families allow related data to be stored in a single record.
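The row-key and column-family model described above can be sketched as a toy in-memory structure. This is only an illustration of the data model under stated assumptions, not Google's implementation or API. The WideRowTable class, the "content" and "links" family names, and the row keys are all hypothetical.

```python
from collections import defaultdict


class WideRowTable:
    """Toy in-memory table: row key -> column family -> column -> value."""

    def __init__(self, families):
        self.families = set(families)
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row_key][family][column] = value

    def get_row(self, row_key):
        # One lookup by key returns all related data; no join is needed.
        row = self.rows.get(row_key, {})
        return {family: dict(columns) for family, columns in row.items()}

    def scan_prefix(self, prefix):
        # Keys are visited in sorted order, so keys that share a prefix
        # (here, pages of the same website) sit next to each other.
        return [key for key in sorted(self.rows) if key.startswith(prefix)]


# Hypothetical row keys: the site name comes first so its pages group together.
table = WideRowTable(families=["content", "links"])
table.put("com.example/index.html", "content", "html", "<html>home</html>")
table.put("com.example/index.html", "links", "com.other/page", "Example home")
table.put("com.example/about.html", "content", "html", "<html>about</html>")
table.put("org.sample/index.html", "content", "html", "<html>sample</html>")

row = table.get_row("com.example/index.html")
same_site = table.scan_prefix("com.example/")
```

Here get_row returns the page content and its links in one record, where a relational design would typically join a pages table to a links table.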

Bigtable is designed to be distributed on commodity servers, a common theme for all NoSQL databases created after the information explosion caused by the adoption of the World Wide Web. A commodity server is one without complex bells and whistles - for example, Dell or HP servers with perhaps 2 CPUs, 8 to 16 cores, and 32 to 96GB of RAM. Nothing fancy, lots of them, and cheaper than buying one big server (which is like putting all your eggs in one expensive basket).

Amazon Dynamo paper

Amazon released a paper of its own in 2007 describing its Dynamo data storage application. In Amazon's words: "Dynamo is used to manage the state of services that have very high reliability requirements and need tight control over the tradeoffs between availability, consistency, cost-effectiveness and performance."

The paper goes on to describe how a lot of Amazon data is stored by use of a primary key, how consistent hashing is used to partition and distribute data, and how object versioning is used to maintain consistency across data centers.
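Consistent hashing, mentioned above, can be sketched in a few lines. This is a minimal illustration of the general technique, not Amazon's implementation. The ConsistentHashRing class, the node names, the "cart:" keys, the choice of SHA-256, and the number of ring positions per node are all assumptions made for the example.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Maps keys to nodes by placing both on the same hash ring."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes  # ring positions per node (an assumed value)
        self._ring = []       # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def node_for(self, key):
        if not self._ring:
            raise ValueError("ring is empty")
        # Pick the first ring position at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]


ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
keys = [f"cart:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.remove_node("node-c")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

The point of the design is visible in moved: when a node leaves, only the keys it owned are reassigned, while keys on the other nodes stay where they were. A plain hash-modulo-N scheme would reshuffle most keys instead.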

The Dynamo paper basically describes the first globally distributed key-value store used at Amazon. Here the keys are logical IDs, and the values can be any binary value of interest to the developer. A very simple model, indeed.

These two papers inspired many different organizations to create their NoSQL databases. There were so many variations that some people thought it necessary to meet and discuss the various approaches being taken (see "The second NoSQL 'meetup'" sidebar).

The second NoSQL "meetup"

Many open-source NoSQL databases had emerged by 2009. Riak, MongoDB, HBase, Accumulo, Hypertable, Redis, Cassandra, and Neo4j were all created between 2007 and 2009. These are just a few of the NoSQL databases created during this time, so as you can see, a lot of systems were produced in a short period. Even now, innovation moves at breakneck speed.

This rapidly changing environment led Eric Evans from Rackspace and Johan Oskarsson from Last.fm to organize the first modern NoSQL meetup. Needing a title for the meeting that could be distributed easily on social media, they chose the #NoSQL tag.

The #NoSQL hashtag is the first modern use of what we today all regard as the term NoSQL. The description from the meeting is well worth reading in full, as the sentiment remains accurate today.

"This meetup is about 'open source, distributed, non relational databases'.

Have you run into limitations with traditional relational databases? Don't mind trading a query language for scalability? Or perhaps you just like shiny new things to try out? Either way this meetup is for you.
Join us in figuring out why these newfangled Dynamo clones and BigTables have become so popular lately. We have gathered presenters from the most interesting projects around to give us all an introduction to the field."

This meetup included speakers from LinkedIn, Facebook, Powerset, StumbleUpon, Zvents, and couch.io who discussed Voldemort, Cassandra, Dynomite, HBase, Hypertable, and CouchDB, respectively.

This meeting represented the...

Schweitzer Fachinformationen
