Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Consider: it’s totally possible to strip down Postgres until all you have left is an embedded RDBMS of the style of SQLite. (I’m not sure why nobody has done this yet, actually.) Would you call the result “just a file format”?

Such an instance of “embedded Postgres” would still have a huge sprawling catalog (PG_DATA) directory attached to each use of it, so it wouldn’t be contained to a single file. But neither is SQLite contained to a single file—SQLite maintains a journal and/or WAL in a second file.

And, yes, this “embedded Postgres” would require things like vacuuming. But... so does SQLite. Have you never maintained an application that maintains a long-lived “project” as a single SQLite file, where changes are written into this project file repeatedly over a long period? (Think: the “library” databases of music/photo library management software.) SQLite database files experience performance degradation from dead tuples too, and need all the same maintenance. Often “database version migrations” of such software is written to either rewrite the SQLite file into a clean state, or—if it has the possibility of being too big for that to be a quick task—to call regular VACUUM-like SQL commands to clean the database state up.

——

Now, I get what you’re trying to say; the point that you’re trying to make—that SQLite might be a relational database, but it’s not a relational database management system in the sense of sitting around online+idle where it can do maintenance tasks like auto-vacuuming. Unlike an RDBMS, SQLite doesn’t have its own “thread of execution”: it is a library whose functioning only “happens” when something calls into it. By analogy, regular RDBMSes are like regular OS kernels, while SQLite is like a library kernel or exokernel.

But that doesn’t mean that SQLite is a file format! It can be used as one, certainly, but what SQLite is is exactly the analogy above: the kernel of an RDBMS, externalized to a library. As long as you “run” said kernel from your application, and your application is a daemon with its own thread of execution, then your application is an RDBMS.

This can be seen most directly in systems like ActorDB, that simply act as a “transaction server” routing requests to SQLite. ActorDB is, pretty obviously, an RDBMS; but it achieves that not due to its own features, but 99% due to embedding SQLite. All it does is call into SQLite, which already has the “management system” part of an RDBMS built in, just not called unless you use it—just like exokernels often already have things like a scheduler, just not called into unless you as the application layer do so.



Great comment! Respectful of GP and constructively critical.

The other thing I would mention is that SQLite can operate totally in memory which makes it useful without even using it to persist data (say you have a language with a slow dataframe API, just use SQLite in memory to process your data).


FirebirdSQL is pretty close to what you're talking about... it's library can do either embedded or you can run a shared server instance. It's really pretty neat, but on the one side, PostgreSQL is probably better, and on the other SQLite is nicer.

I worked on a project a few years ago, where I chose Firebird so I could use literally the same database on potentially offline sites that regularly sync up to a main office (shared) deployment. I worked pretty well and was still a lot of work.


SQLite is a file format with a familiar API and uses SQL as the logic for searching/adding data to the file.

Approach it exactly the same way you'd approach using a CSV file and all the confusion and overthinking about it goes away. Approach it as a stripped down RDBMS and you end up with all kinds of questions about support for this or that RDBMS familiar service.

You can write your own SQLite file reader/writer. Here's the specs (includes the specs for the Journal and WAL files and semantics as well) https://www.sqlite.org/fileformat.html

Here's an example of somebody who's done this. https://sqljet.com/ - this is not a wrapper on the sqlite C code, this is a re-implementation of that code that is binary compatible with SQLite files.

The Journal file only exists as a temporary file until transactions complete. The .sqlite file you make is the entire atomic file that follows the SQLite file format. The Journal has its own file format. Same goes for the Write-Ahead log.

RDBMSs also manage connection queues, account management, rights and permissions, and so on. Many overcome various OS limitations by providing their own entire file management, fopen(), virtual memory and other subsystems that are tuned to their workloads.

SQLite is a file format. SQLite uses familiar relational paradigms to make it easy to read/write data to the format without having to learn yet another API and domain language. The API code is extraordinarily well tested, and it makes simple complex logic like transaction journaling, indexing and so on.

>Consider: it’s totally possible to strip down Postgres until all you have left is an embedded RDBMS of the style of SQLite.

No! SQLite is not an embedded RDBMS. It's a file format.

If there was a library you could import, and it provided methods to read and write directly to files that PostgreSQL could read/write to and there was nothing else to install, no runtime, no daemons, no servers, etc., then we could pass around self-contained PostgreSQL files to each other. Then PostgreSQL files would be a file format as well.

Have you ever used a library to read/write from a CSV, JSON, JPEG? It's no different than doing so for a SQLite file!

SQLite is a file format.


File formats don't have an API.

> SQLite is a file format with a familiar API and uses SQL as the logic for searching/adding data to the file.

That description is for a library, not a file format. SQLite is a library, that saves to a convenient format and allows you to query the file using the SQL syntax. File formats dont have "logic".

You are arguing the equivalent that Word is a file format. While there is a Word file format, Word itself is an application.


> File formats don't have an API.

So when you read/write to any other file format, you just read/write bytes directly to/from disk and re-implement the parsing and read/write logic in your own code every time?

> File formats don't have "logic".

Every file format has logic otherwise it's just random entropy in an arbitrarily long byte stream on a disk. How to read/parse and interact with that format depends entirely on the logic and scheme for that file. For example, many file formats have an index of some kind that you must read and parse before you can figure out where the other data lives, compressed file formats often store a dictionary, image formats often have a compression/decompression logic that must be followed for reading/writing to them.

> You are arguing the equivalent that Word is a file format. While there is a Word file format, Word itself is an application.

.docx is the file format for Word documents. There are many APIs and programs that can read/write to/from .docx files.


> No! SQLite is not an embedded RDBMS. It's a file format.

You keep saying it's a file format, but it's quite possible to use sqlite without persisting anything to a file at all.

    rc = sqlite3_open(":memory:", &db);


You can read/write CSV files, JSON, JPEG, WAV, MP3, MP4, etc. into memory as well. That doesn't make any of them an RDBMS.

SQLite is a file format. It has a nice API and uses SQL as the domain language for read/write logic. If it didn't use SQL for the logic, would you still be confused?


Being a file format is only one aspect of what SQLite is. SQLite describes itself as "SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine." I think that is a better, more encompassing, description than "it's a file format"

Edit:

If SQLite lacked any ability to persist data to disk, it would still be very useful as an in-process SQL engine for many sorts of problems. Certainly not as useful as it currently is, but nevertheless still useful.

I'd say the file format without the SQL engine, or the SQL engine without the file format, would be like peanut-butter without jelly. Certainly not pointless, but the real magic comes from the combination of the two.


Sure, I can get behind that. The logic that handles the read/writing and SQL parsing is all part of the library for sure.

But that library is absolutely not required nor is SQL. One could build their own library that read/write to SQLite files entirely without any of that if they wished. Some people have done ground up rewrites in other languages other than C, but have more or less stuck with the same internal logic and the use of SQL as the read/write logic.


All other RDBMS products also store their information inside files. I can take a mdb file from SQL Server and copy it to another server and attach it there.

SQLite is obviously a library that stores it's data to a file like millions of other libraries. SQLite is to it's data file the way LAME is to MP3. LAME is not the file format, it's the library.


> All other RDBMS products also store their information inside files.

No they don't. There are many RDBMS (and other DBMSs) that do not store their information inside files.

> I can take a mdb file from SQL Server and copy it to another server and attach it there.

Yes, SQL Server is an RDBMS.

MDB files are files defined by a file format. This format defines something called a database. There are many libraries and other pieces of software than can read/write mdb files (https://jackcess.sourceforge.io/)

SQLite files are files defined by a file format (https://www.sqlite.org/fileformat.html). You can write code that reads and writes to a SQLite database file without using the library. So long as you follow the format specification, you will produce or be able to read from an aribtrary sqlite file that is produced by any code that implements the specification.

The SQLite library is a reference implementation of the specification as well as some sample logic for reading/writing to the files and using SQL to define the interactions with the file. The SQLite library is not required for this, nor is SQL, nor is the logic for ACID compliance.

SQLite is not an RDBMS and offers almost nothing that an RDBMS (or DBMS) might offer. If you wish to use SQLite files with an RDBMS you either have to build the RDBMS yourself, or find somebody else who's done so.

The more you consider the SQLite a file format, the easier it becomes to work with and understand. The more you try to consider it an RDBMS the less it makes sense.

Just because SQL is involved, doesn't mean it's an RDBMS.


SQLite is a a library the implements an embedded database engine. The file that it stores this data in is an artifact, not the interface.


SQLite is a whole library with complicated locking protocol options etc. and works with multiple files of different formats (e.g. separate transaction log).


Since the file format specification is available, you can write your own code that directly reads/writes to SQLite files. You can't even have to use the logs and journaling options. You don't even have to use SQL.


That’s the SQLite database file format. SQLite itself is a library that can deal with multiple files in different formats (transaction log and database file). It also reads .sqliterc and maybe others.


> That’s the SQLite database file format.

You got it! SQLite is a file format.

> SQLite itself is a library that can deal with multiple files in different formats (transaction log and database file). It also reads .sqliterc and maybe others.

Yup you also got it! SQLite is not an RDBMS and shouldn't be approached that way.

The more you try to ram RDBMS ideas into what SQLite is, the more it won't be that. The more you try to treat it like a file format, the more it will be what you want.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: