Transpile Any SQL to PostgreSQL Dialect

jakozaur · on March 18, 2024

I see a lot of these types of tools, each with some essential support, but usually they don't work. I would appreciate what is support what is not.

Tried this tool: go run gitlab.com/dalibo/transqlate@v0.1-beta.2

select TRUNC(SYSDATE, 'month') from dual

-- TRANSLATION ERROR at +1:8: unsupported date format: "month" rule="replace trunc() by date_trunc()" select DATE_TRUNC('month', LOCALTIMESTAMP)exit status 1

Also, at least, it would set the latest git tag, currently command from Readme.md doesn't work:

go run gitlab.com/dalibo/transqlate@latest

go: gitlab.com/dalibo/transqlate@latest: module gitlab.com/dalibo/transqlate@latest found (v0.0.0), but does not contain package gitlab.com/dalibo/transqlate

latest should point to v0.1-beta.2

fljdin · on March 18, 2024

Thank you for your feedback, both issues have been added to the project tracking.

zeroxfe · on March 18, 2024

I asked Google Gemini to do this, and it gave me a seemingly convincing response. Anyone know if it's right?

Prompt:

  translate the SQL statement "select TRUNC(SYSDATE, 'month') from dual" to a PostgreSQL dialect

Response (formatting removed):

  Here is the equivalent SQL statement in PostgreSQL for "select TRUNC(SYSDATE, 'month') from dual" in Oracle:

    SELECT DATE_TRUNC('month', CURRENT_DATE) AT TIME ZONE 'UTC'

  Use code with caution.

  In PostgreSQL:

    DATE_TRUNC is the function used to truncate a date value to a specified unit.

    CURRENT_DATE is a function that returns the current date.

    AT TIME ZONE 'UTC' is used to specify that the truncation should be done in Coordinated Universal Time (UTC). This is optional, but it's a good practice to be explicit about the time zone to avoid any ambiguity.

  The dual table is a dummy table specific to Oracle that is used in simple SQL statements. 

  PostgreSQL doesn't require a table reference in this case.

peterldowns · on March 18, 2024

I view a comment like yours as essentially spam and of negative value. I'm not sure if we (vaguely, hn forum community) have agreed on the etiquette here, but I have to ask why you would bother sharing an AI-generated response if you're not sure if it's correct. What are you getting at? How could this be helpful?

saurik · on March 18, 2024

I am often picky about my/our time being wasted by low-effort comments but in this case I actually was myself coming into this thread assuming the actual tool being discussed was itself going to be an LLM-based solution; and, when it wasn't, all I was thinking was "aren't we in the future? isn't this just something people are going to ask an AI to handle now?".

Hell: a friend of mine claims he asked Claude (using some plug-in in his IDE) to port his app from Firebase all the way to PostgreSQL and it just did it... and it mostly worked well enough to save him a ton of time typing and looking up syntax for stuff, and I think it is now in production?

I do agree, though--and maybe this was your core point--that maybe this "let's ask Gemini" comment should be left by someone who would also know if the answer was correct rather than just saying "no clue but". Although, honestly, even having the output seems useful here (and again: I am someone constantly unhappy about people asking questions they can Google and other such filler content).

irq-1 · on March 18, 2024

I agree: the comment was on-topic, included technical information (dual table?), and AI accuracy/usefulness is of interest to most readers. Maybe in a year I'll think such a comment is bad, but not today.

febed · on March 18, 2024

> a friend of mine claims he asked Claude (using some plug-in in his IDE) to port his app from Firebase all the way to PostgreSQL and it just did it

In now interested in your friends setup. Do you have more details?

saurik · on March 19, 2024

"I’ve been using Phind + their VS Code plugin" ... "I got them to add Claude 3 Opus support"

parhamn · on March 18, 2024

I think it's really reasonable. When I was going through the readme I was (half heartedly) interested in seeing an OPENAI_KEY config variable in the readme.

zeroxfe · on March 18, 2024

Apologies, I was trying to imply that LLMs may be more practical to solve these kinds of problems than hand-built translators (assuming the primary purpose is for engineers using this during development.)

scarface_74 · on March 19, 2024

A transpiler will be consistent and can be correct.

An LLM based solution can be correct but won’t be consistent.

I would much rather a tool be consistent and always incorrect in the same way than inconsistent and sometimes correct. I can create test cases and correct the underlying code. I can’t do that with an LLM

zeroxfe · on March 19, 2024

> A transpiler will be consistent and can be correct.

No disagreement. My point is that it's impractically hard to build especially given the scope and complexity of the SQL dialects of major databases.

irq-1 · on March 18, 2024

Looks like `AT TIME ZONE 'UTC'` should not be used, but I don't think it's illegal syntax.

https://www.postgresql.org/docs/current/datatype-datetime.ht...

> PostgreSQL endeavors to be compatible with the SQL standard definitions for typical usage. However, the SQL standard has an odd mix of date and time types and capabilities...

> Although the date type cannot have an associated time zone, the time type can.

(emphasis mine)

> To address these difficulties, we recommend using date/time types that contain both date and time when using time zones.

cipherself · on March 18, 2024

Interesting, a while back at $EMPLOYER while working in a PoC, I found myself in a situation where I need to take user-provided Postgres SQL queries and run them against one of our APIs. Roughly, I converted the API response to a `pandas dataframe`, I then parsed and transformed the query from the postgres dialect to the duckdb dialect using `sqlglot` and used `duckdb` to query the `pandas dataframe`, coverted it into `json` and returned it to the user.

e12e · on March 19, 2024

Evidence does something similar - dumping data to parquet:

https://docs.evidence.dev/core-concepts/data-sources

https://news.ycombinator.com/item?id=35645464

I whish the implied ETL step was even clearer from the homepage - it's not really feasible for us to dump entire tables to the dev machines for working with production data - but it is an interesting concept.

anitil · on March 19, 2024

That's a really creative way to solve that problem! I would have spun up a temporary pg instance etc, but this is much nicer.

neeleshs · on March 18, 2024

That's interesting. I'm looking to do something similar, but need wire compatibility with postgresql. This allows any postgres client to talk to our service. Didn't have a lot of luck finding a good "middleware"

cipherself · on March 18, 2024

I think this is probably the missing piece for you? https://github.com/jwills/buenavista

neeleshs · on March 19, 2024

Thanks. This looks promising!

hantusk · on March 18, 2024

you could consider hosting an empty postgresql database, compile your code as a postgresql foreign data wrapper and expose it as a view. Nothing is more compatible with the postgres wire protocol than postgresql itself ;)

turbot compiles their steampipe plugins in this way. Example: https://github.com/turbot/steampipe-plugin-net

neeleshs · on March 18, 2024

That's a good idea and we considered FDW (for this and more stuff), but having a middleware makes it more flexible - FDW has limitations around pushdown with subselects, we are still constrained to a single postgresql instance for execution, when in theory we could parallelize (certain) queries across nodes.

souenzzo · on March 18, 2024

https://www.jooq.org/translate/

JOOQ also do that, but it is in JAVA.

fljdin · on March 18, 2024

The `transqlate` project aims to extend this kind of tool without limitation and under open-source license.

We're studying JOOQ from afar, with the ambition of getting closer to its functional coverage.

r1b · on March 18, 2024

Recommend checking out https://github.com/tobymao/sqlglot if you are interested in this capability for other SQL dialects

Tools like this are helpful for:

- Rendering SQL in a consistent way, eg for snapshot testing

- Testing SQL business logic in CI against a dialect with less heavyweight dependencies

- Applying AST transformations to take advantage of dialect-specific optimizations

adeelk93 · on March 19, 2024

And column level lineage!

teslatom · on March 18, 2024

I've been looking for a library that can convert query plans, not just ASTs, between Postgres and other query engines. The reason this is useful is because RLS (row level security) settings are injected into the query plan after the AST is constructed, so the same query string can produce different query plans depending on the security settings of the database.

Substrait comes close, but I haven't found a Postgres Substrait producer. Are there any projects working on this?

pentaphobe · on March 18, 2024

`transqlate` is such a good name - I am very jealous, well done!

ThinkBeat · on March 18, 2024

I see that the support for Oracle is mentioned and clearly present in the source code, is that also at present the only supported database? Any hope for transactSql (Microsoft Sql. Server)

staticlibs · on March 18, 2024

Transact-SQL support (both queries and stored procedures) for PostgreSQL is implemented in Babelfish extensions [1]. It is implemented from the ground up as a "pl/tsql" language for Postgres [2], without using SQL-to-SQL transpilation like JOOQ.

[1] https://babelfishpg.org/ [2] https://github.com/babelfish-for-postgresql/babelfish_extens...

hans_castorp · on March 18, 2024

From the "Roadmap" section of the README:

> more source dialects : mysql, mssql, sybase, etc.

https://gitlab.com/dalibo/transqlate#roadmap

fuzzieozzie · on March 18, 2024

Many tools try to achieve similar eg. https://aws.amazon.com/rds/aurora/babelfish/

One company, CompilerWorks actually had tools that transpiled many SQL dialects. They were bought by Google https://www.crunchbase.com/acquisition/google-acquires-compi...

wendyshu · on March 18, 2024

How is it different from sqlglot?

peter_d_sherman · on March 18, 2024

https://prql-lang.org/

HN Article, PRQL:

https://news.ycombinator.com/item?id=36866861

Google search, site: GitHub, "SQL Transpiler":

https://www.google.com/search?q=site%3Agithub.com+%22SQL+Tra...

GitHub general list of open-source Transpilers:

https://github.com/topics/transpiler

LeftHandPath · on March 18, 2024

I wondered why this sort of thing didn’t already exist. I had to develop for IBM Db2 (for i Series) on a Linux environment (using MariaDB) due to the licensing that stopped us from having a test env.

I wound up writing a very ad-hoc C++ program that would parse a base SQL file and generate the appropriate DB2 and MariaDB dialect versions. Not flexible enough for reuse but it got the job done.

Symbiote · on March 18, 2024

I think Apache Calcite can do this, or something similar.

https://calcite.apache.org/

amw-zero · on March 18, 2024

This in combination with [pg_query](https://github.com/pganalyze/libpg_query) could allow for writing generic static analyzers.

liotier · on March 18, 2024

I hope for the day such tool will convert Oracle PL/SQL to Postgres

fljdin · on March 18, 2024

This is one of the major challenges the tool aims to meet! We hope to achieve reliable results this year.

https://gitlab.com/dalibo/transqlate/-/issues/20

pjmlp · on March 19, 2024

Given how many SQL features are RDMS specific, I doubt the claim of being able to convert any SQL into PostgreSQL dialect.

chaps · on March 18, 2024

Nice. Anyone know if similar exists going the opposite direction? Specifically thinking of postgresql -> oracle sql.

pmarreck · on March 18, 2024

This is interesting.

Is there any standardized AST for SQL?

jerf · on March 18, 2024

The SQL standard nominally defines one (I don't know if it hands it to you on a silver platter as some sort of grammar but it certainly defines one one way or another), but if you just implement that and then send your code out into the world, you're going to be disappointed with the results. All the database have deviations and extensions of their own and people use them frequently.

fljdin · on March 18, 2024

A CLI tool to transpile SQL snippet from a dialect to another using an AST

quantxx · on March 18, 2024

Interesting

throwitaway222 · on March 18, 2024

[flagged]

ramraj07 · on March 18, 2024

No one trusts these LLMs enough to plug sql that comes out of it directly into a compiler.