I am very curious as to how databases are used in the real world, whether you’re using MySQL and what not, how does it all come together in a real world business? Banking and gaming I know, but is it something that gets stored on data centres and then put into a VM?
I might be overcomplicating this, but I understand the good use cases for VMs, containers, etc., just not for databases.
I’d google, but I’d like an ELI5 due to my smooth brain with these concepts, thank you.
Computing without databases is like going into a grocery store where all of the items are in one great pile. Sure, given enough time (CPU) and resources (RAM) you could find what you’re looking for, but it’s horribly inefficient.
Instead, things which are similar are grouped together, like the baking aisle (tables) and if you have to get most of the items for a cake, you know it’s on a specific shelf.
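To make the analogy concrete, here is a rough sketch of the two approaches. The inventory, aisle names, and function names are all made up for illustration; a real database does something far more sophisticated, but the idea of organizing once so lookups are cheap is the same.

```python
from collections import defaultdict

# Hypothetical grocery inventory: (item, aisle) pairs in no particular order.
PILE = [
    ("flour", "baking"),
    ("milk", "dairy"),
    ("sugar", "baking"),
    ("cheese", "dairy"),
    ("vanilla", "baking"),
]


def find_in_pile(pile, aisle):
    """The 'one great pile' approach: look at every single item."""
    return [item for item, item_aisle in pile if item_aisle == aisle]


def build_aisles(pile):
    """Group items by aisle once, so later lookups go straight to the shelf."""
    aisles = defaultdict(list)
    for item, aisle in pile:
        aisles[aisle].append(item)
    return aisles


AISLES = build_aisles(PILE)

baking_from_pile = find_in_pile(PILE, "baking")  # scans all five items
baking_from_aisles = AISLES["baking"]            # jumps directly to the group
```

Both lookups return the same items. The difference is that the pile has to be searched from top to bottom every time, while the grouped version pays the sorting cost once.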
This is a nice explanation
ELI5 due to my smooth brain with these concepts
Database is a broad term; nearly any computer system that stores structured data, like lists of names or transactions, has a database of some kind. Dedicated database platforms like MySQL just make it more efficient / faster / easier to query and store data.
Yeah, even the file system is a hierarchical database. Same with the Windows registry and the Internet domain registry. Not every database has to be SQL or relational. There are also time-series databases like InfluxDB for storing time-based data such as metrics and logs.
I think The Manga Guide to Databases would be a good read for you.
Be sure to avoid places where you can get the book for free such as here.
Thank you for the warning. I made sure to bookmark it so if i ever stumble upon it by accident, I’ll immediately know i need to leave.
Yeah, I would never go there. Just like I won’t go here
Thanks for sharing places to avoid so we can stay safe online 💜
I just bought the book from No Starch Press last month
Hey, that’s awesome! Also, thanks for the warning on that second link. It’s always good to look out for others who don’t know any better.
If I understood your question correctly,
A database is a base for your data; it’s where your data lives.
Data is the most important thing for most modern applications. You have an account on Lemmy. You have made posts and comments. All of that post text, those images, etc. have to be recorded somewhere on the server. That’s what a database does.
The Lemmy app we are using on our phones needs to download content from Lemmy so it can be displayed to us. Lemmy might just have one big file full of links, but that’s annoying to have to write code to handle. Or it might have a folder full of files where each file is a post, but that’s also a bit annoying to write code to manage.
It (probably) uses a local SQLite database to store all of the cached posts.
Conceptually, a database is just a place to store things, just like a big text file. The database just handles a lot of the grunt work for you and makes it easier to search, organize, and filter the data.
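As a rough sketch of that grunt work, here is what a tiny post cache could look like with Python's built-in sqlite3 module. The table name, columns, and sample posts are invented for this example and are not what any actual Lemmy app uses.

```python
import sqlite3

# An in-memory database keeps the example self-contained; a real app
# would pass a file path here so the cache survives restarts.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE cached_posts (
        id        INTEGER PRIMARY KEY,
        community TEXT    NOT NULL,
        title     TEXT    NOT NULL,
        score     INTEGER NOT NULL
    )
    """
)

# Hypothetical posts downloaded from the server.
conn.executemany(
    "INSERT INTO cached_posts (community, title, score) VALUES (?, ?, ?)",
    [
        ("technology", "How databases work", 42),
        ("technology", "New phone released", 7),
        ("cooking", "Best cake recipe", 30),
        ("technology", "Postgres tips", 15),
    ],
)

# Searching, filtering, and sorting in one statement, with no file
# parsing code to write ourselves.
top_tech = conn.execute(
    "SELECT title FROM cached_posts "
    "WHERE community = ? AND score >= ? "
    "ORDER BY score DESC",
    ("technology", 10),
).fetchall()
```

Doing the same with one big text file would mean writing the parsing, filtering, and sorting by hand, and keeping it correct as the format changes.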
So anywhere there is data, there could be a database.
Databases aren’t related to VMs or containers.
https://en.m.wikipedia.org/wiki/Database does a good job of describing what a database is. That page also has a lot of examples of uses of databases.
To answer your question about MySQL: in my experience it’s rarely used outside of classrooms or archaic systems. Postgres is a much better general-purpose option for SQL. SQLite is also nice for different use cases (such as a database on a mobile device).
MySQL is unfortunately still rather popular. Loads of PHP based sites like Wikipedia and WordPress are all MySQL still.
It’s catching up a bit, especially the forks like Percona Server and MariaDB.
Yeah that’s what I was referring to by “archaic”. Pretty much anything using the LAMP stack falls in that category. I don’t generally see new things using it.
A good ELI5 is to imagine a couple of Excel sheets. Each sheet is a “table” and each row is a record. So you’re gonna have a column for the first name, a column for the last name, a column for the email address and so on. That’d be your users table/sheet.
Then you would have another Excel sheet that contains posts. Each post record references the row number in the users sheet so you can cross-reference the user record of the post’s author.
And so on. It’s a way to store, look up, and retrieve records, usually cross-referencing other records until you have all the information you need to serve a particular request. There’s an index, like the index at the back of a book, that lets you quickly find which page the record you’re looking for is on.
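Here is a minimal sketch of those two "sheets" using Python's built-in sqlite3 module. The column names, the index name, and the sample rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two "sheets": users and posts. Each post stores the user_id of its
# author, which is the cross-reference back to the users sheet.
conn.executescript(
    """
    CREATE TABLE users (
        user_id    INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name  TEXT,
        email      TEXT
    );
    CREATE TABLE posts (
        post_id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(user_id),
        title   TEXT
    );
    CREATE INDEX idx_posts_user_id ON posts(user_id);
    """
)

conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [
        (1, "Ada", "Lovelace", "ada@example.com"),
        (2, "Alan", "Turing", "alan@example.com"),
    ],
)
conn.executemany(
    "INSERT INTO posts (user_id, title) VALUES (?, ?)",
    [
        (1, "Notes on the engine"),
        (2, "On computable numbers"),
        (1, "A second note"),
    ],
)

# Follow the cross-reference: fetch every post by user 1, together
# with the author's name from the other table.
rows = conn.execute(
    "SELECT users.first_name, posts.title "
    "FROM posts JOIN users ON users.user_id = posts.user_id "
    "WHERE posts.user_id = ? "
    "ORDER BY posts.post_id",
    (1,),
).fetchall()
```

The index on posts(user_id) plays the role of the book index: the database can jump to a user's posts instead of reading every row.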
We use databases because they’re engines designed to ensure data consistency and fast, structured access to the data. Usually the database runs on a server that other servers connect to, so all servers have the same view of the data. That can be a VM in the cloud, a cluster of VMs in a cloud, or Docker containers. It’s just software that manages data so we don’t have to reinvent the wheel every time we need to store stuff. Then you just ask the database questions, like “what are the last 50 posts made by this user number?” (
SELECT users.username, posts.title FROM posts LEFT JOIN users USING (user_id) WHERE posts.user_id = 42 ORDER BY posts.date_inserted DESC LIMIT 50
).
ELI5: a database is the “memory” of a program.
Every piece of data that any software uses almost certainly comes from and goes to multiple databases.
Once the data is stored, you can execute “queries” to have powerful access to update many records at a time, read particular records based on their relationship to other records, and so much more.
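For instance, updating many records at a time is a single statement. This is only a sketch with Python's built-in sqlite3 module; the orders table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT NOT NULL, status TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO orders (customer, status) VALUES (?, ?)",
    [
        ("alice", "pending"),
        ("bob", "pending"),
        ("alice", "shipped"),
        ("alice", "pending"),
    ],
)

# One query changes every matching record; no loop over rows needed.
cursor = conn.execute(
    "UPDATE orders SET status = 'cancelled' "
    "WHERE customer = ? AND status = 'pending'",
    ("alice",),
)
updated_count = cursor.rowcount  # number of rows the UPDATE touched

remaining_pending = conn.execute(
    "SELECT customer FROM orders WHERE status = 'pending'"
).fetchall()
```

Only alice's pending orders are cancelled; her shipped order and bob's pending order are left alone because they do not match the WHERE clause.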
Your bank balances, your purchase history, your emails, every part of your digital life is almost certainly spread across a constellation of databases.
Bonus Fediverse content:
Lemmy itself uses the Postgres database extensively. Posts, users, comments, votes and more are all individually stored in the database.
Mastodon also uses Postgres. If a post goes up on Lemmy, and a Mastodon server is federated with it, the Lemmy server will send out an HTTP request to the Mastodon server containing the contents of the post. The Mastodon server will use this information to write its own record of the post in its own database.
Regarding your question about VMs: You can run a database inside a VM, or give the VM access to an outside database via queries, or both! You might run SQLite (a small and excellent embedded database) on the VM to track its local state, while also running queries against a large Postgres database to synchronize with other services in the cluster.
Not sure what you mean by ‘come together’… For users there would be a ‘front end’ user interface, either a website or an application, that includes access control, forms, fields, etc. On the ‘back end’ there would be an application server somewhere that connects to the database and lets certain users view/edit certain things as scripted.
Databases are collections of files sorted intelligently[1] and normally unintelligible to humans. But they are files in the end. Then, a system is located between users (humans or other systems) and those files. That system is called a Database Management System (think of the MySQL program).
Then, users can ask the Database Management System questions about the data or give commands that will cause new data or changes in the data, and the system will search the files and/or modify their contents to answer the questions or satisfy the commands.
You see, it is simply a collection of files and a program that can manage those files (so it needs access to them). In the real world, that tuple can be deployed in multiple ways.
Now, the concepts of “program,” “process,” and “files” are high-level concepts. They exist thanks to Operating Systems, which allow us to speak in those terms. And that’s it: You need at least an Operating System to deploy a database solution.
In the real world, a single big server machine with a lot of space can be dedicated to a database solution. In the LAMP age, this was a normal practice. This example has no virtual machine or container; it is just a plain machine. That’s still an option.
There are more complex solutions that split these main concepts (files, programs/processes) into different “run time” environments (machines, virtual machines, clusters, etc.) where they can be fine-tuned to a specific task.
Now, the whole deal with the databases is to coordinate multiple users while serving as permanent memory. It is the central point where everything else orbits. They try to provide users with an ideal data model that is always updated at the quantum moment so that when users observe it, they can trust that it is the truth.
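One way databases keep that shared picture trustworthy is with transactions: a group of changes either all happen or none do. Below is a small sketch with Python's built-in sqlite3 module. The accounts table, the names, and the transfer helper are invented for illustration, and the receiver is credited first on purpose so the rollback is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        # The connection as a context manager commits on success and
        # rolls back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance, so the
        # credit to the receiver was rolled back as well.
        return False


ok = transfer(conn, "alice", "bob", 30)         # alice can afford this
rejected = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer, bob does not keep the 500 that was briefly credited to him. Anyone reading the table sees either the state before the transfer or the state after it, never a half-finished one.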
[1] By “intelligently” I mean that they are sorted in that (very complicated) way for a reason: to facilitate the typical operations a database solution has to perform, like searching for and manipulating data, taking advantage of schemas (if it is a database with a schema).
I don’t think many businesses use MySQL when they can use PostgreSQL. Oracle is used very often. MSSQL in stupid cases.
You obviously need databases to, eh, store data, index it, process it, access it.
Also, as others say, it’s a wide concept. A file system is a database. In some sense BitTorrent DHT is a database.
MSSQL in Microsoft* cases
FTFY, although arguably Microsoft and Stupid are synonyms