C, Fortran, and single-character strings

ryl00 · on June 22, 2019

I work on a large, multi-platform, legacy codebase of C/C++ and Fortran, and the FFI between the two sides has always been an area where care must be taken to ensure proper interop.

Being multi-platform, multi-compiler through the years has helped us ferret out FFI problems like the one described here, as our Windows Fortran compiler used to be Compaq Visual Fortran, which interleaved character strings with their length argument (as opposed to gfortran, which puts all such string lengths at the very end of the argument list). With Compaq Visual Fortran (and Intel Fortran with the right options set), forgetting to specify a string length argument on the C side would almost invariably lead to immediate crashes (unless, of course, the sole string argument was at the end).
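To make the two layouts concrete, here is a minimal sketch in C of how a hypothetical Fortran routine `SUBROUTINE TOTAL_LEN(A, B, N)` with two CHARACTER arguments might look from the C side under each convention. The routine name, the C stand-in bodies, and the `size_t` length type are assumptions for illustration; the real length type and symbol naming depend on the compiler and its version.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C stand-ins for a Fortran routine
 *     SUBROUTINE TOTAL_LEN(A, B, N)
 *     CHARACTER*(*) A, B
 *     INTEGER N
 * They illustrate argument order only, not any real compiler's output. */

/* Trailing convention (as described for gfortran): the hidden lengths
 * come after all declared arguments, in the order of the strings. */
void total_len_trailing(const char *a, const char *b, int *n,
                        size_t a_len, size_t b_len)
{
    assert(a != NULL && b != NULL && n != NULL);
    *n = (int)(a_len + b_len);
}

/* Interleaved convention (as described for Compaq Visual Fortran):
 * each hidden length immediately follows its own string. */
void total_len_interleaved(const char *a, size_t a_len,
                           const char *b, size_t b_len, int *n)
{
    assert(a != NULL && b != NULL && n != NULL);
    *n = (int)(a_len + b_len);
}
```

With the interleaved layout, dropping `a_len` shifts every later argument by one slot, which is why the mistake tended to crash right away. With the trailing layout, the declared arguments still land where the callee expects them, so the error can stay hidden.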

ChickeNES · on June 22, 2019

Compaq had their own Fortran compiler?

EvanAnderson · on June 22, 2019

It was DEC's and came to Compaq by way of the acquisition.

wumpus · on June 21, 2019

The discussion is probably pretty confusing for most folks not familiar with Fortran.

Fortran compilers can set their own ABI and usually do for everything after Fortran 77. For Fortran 77 and earlier, which includes CHARACTER* 1, most compilers use the same ABI that the Bell Labs f77 preprocessor used, which lives on as f2c in the modern era. This compatibility is common enough that many packages written in C that want to be called from Fortran have this ABI directly embedded in their source code.

The problem is that the current gcc gfortran front-end doesn't do this for CHARACTER* 1 strings, and there's C code expecting that extra string length argument. And if they want to change it, any object code with CHARACTER* 1 arguments in it needs to be recompiled.

AnssiH · on June 22, 2019

> The problem is that the current gcc gfortran front-end doesn't do this for CHARACTER* 1 strings, and there's C code expecting that extra string length argument.

Is it? My reading of the article is that gcc gfortran expects the length argument for CHARACTER* 1 to be provided as usual, but there is a lot of C code that does not provide that when calling Fortran functions.

acqq · on June 22, 2019

That's how I read it too. The existing C code which is not passing the expected parameter was a result of the "hey I've tried it without and it worked" approach. It "worked" at the moment, and only for that specific compiler, but once the compiler started to rely on the "fact" that the parameter is "by definition there", some of the previously "working" programs started to break.

And this outcome is in fact nothing special to C and Fortran. I can just as well write a shell script (or any other) which "expects" that there is a parameter and if I don't use it and I later start to depend on it I'm sure there will be some other scripts calling that one and passing something else than what was "defined" as expected to be passed.

On that topic, everybody should read ryl00's comment here. Having more than one really different target is the only practically effective way to immunize yourself against these kinds of dependencies. For me, it's one of the good arguments against "monoculture" software targets.

malaxii · on June 22, 2019

Right, gfortran does not special-case length-1 strings.

The actual problem however is that it still works in simple cases --- when the Fortran code doesn't access the string length (and apparently, doesn't do tail-calls) --- even when the extra integer length arguments are omitted from the call. Because of this it was possible for the mistake to spread widely...
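A small C sketch of why the omission goes unnoticed in simple cases. Both functions are hypothetical C stand-ins for Fortran routines taking a CHARACTER*1 flag. The bogus length is passed explicitly here to imitate whatever value happens to occupy the missing argument slot, since actually omitting the argument cannot be shown in well-defined C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a routine that never looks at the hidden length:
 * its result is the same whatever value sits in that slot. */
int flag_is_upper(const char *uplo, size_t uplo_len)
{
    (void)uplo_len;                 /* length deliberately unused */
    assert(uplo != NULL);
    return uplo[0] == 'U' || uplo[0] == 'u';
}

/* Stand-in for a routine that does use the hidden length: it copies
 * uplo_len characters (capped by the buffer size), so a bogus length
 * changes the result. Returns the number of characters copied. */
size_t copy_flag(char *dst, size_t dst_size,
                 const char *uplo, size_t uplo_len)
{
    assert(dst != NULL && uplo != NULL && dst_size > 0);
    size_t n = uplo_len < dst_size - 1 ? uplo_len : dst_size - 1;
    memcpy(dst, uplo, n);
    dst[n] = '\0';
    return n;
}
```

The first routine behaves identically for any length value, which matches the "still works" case; the second is the kind of code that misbehaves as soon as the slot holds something other than 1.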

jabl · on June 22, 2019

Exactly this. And no, without breaking standard conforming Fortran code it's not possible to special case CHARACTER(len=1) arguments.

lilyball · on June 21, 2019

If the breakage is ultimately caused by compilers like GCC omitting the length for single-character strings, why doesn't this discuss the fix of simply changing GCC to start putting the length there for single-character strings?

0xffff2 · on June 21, 2019

I've never used FORTRAN, so I'm just going off of my reading of the article. It looks to me like the correct requirement is that the C source code itself include the string length, and that developers have not been doing so.

lilyball · on June 21, 2019

Oh I see. I was thinking that calls to Fortran functions were actually recognizable by the compiler, but you're right, the code snippet right at the top shows a manually-inserted strlen(s) in the C code.

kazinator · on June 22, 2019

Hmm. Unless my understanding is off, these tail calls will work right if there happens to be a word in the stack where the missing argument is supposed to be, and that word has a value of 1 (correct string length). Because that looks indistinguishable from the correct argument having been passed.

So that suggests a run-time solution like this:

1. The function examines the string length argument word. If that word is 1, everything is cool; the function proceeds, the helper function is tail called and so forth.

2. If, on entry, that word is not 1, then the function calls itself (with a real non-tail call that allocates a new argument slot). It passes itself the missing 1 value properly. When that nested call returns, it also returns.

3. This nested invocation issued in (2) now sees a value of 1 in that argument since it is correctly passed, and so it proceeds as given in step (1). When all that tail-call-ology finally executes a proper return somewhere, it will return to this nested invocation of that function, which will pop out, and return to the C caller, as described in (2).

No need for a compiler switch to turn off tail calls. Tail calls work among functions that obey the ABI (like all Fortran-Fortran calls). When an ABI violation is detected, then we get an extra frame. (Hopefully there aren't tail calling loops that involve cycling between Fortran and broken C.)

The broken C can be gradually fixed. The workaround can be deprecated and removed when that happens. Correct code doesn't trigger the workaround behavior in the invoked functions and also doesn't suffer the performance hit of the extra nested call. That creates an incentive to fix the C code.
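The three steps above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: `routine`, `helper`, and the `extra_frames` counter are hypothetical names, and the "missing" argument is imitated by passing a wrong length explicitly. In a real compiler-generated prologue the re-entry would have to be a genuine non-tail call; in plain C the optimizer is free to decide either way.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often the workaround path ran (illustration only). */
static int extra_frames = 0;

/* Hypothetical helper that relies on the hidden length being correct. */
static int helper(const char *c, size_t c_len)
{
    assert(c != NULL && c_len == 1);
    return c[0] == 'U';
}

/* Hypothetical routine taking a CHARACTER*1 argument plus hidden length. */
int routine(const char *c, size_t c_len)
{
    if (c_len != 1) {
        /* Step 2: the length slot does not hold 1, so assume the caller
         * omitted it and re-enter with the length supplied properly. */
        extra_frames++;
        int result = routine(c, 1);
        return result;
    }
    /* Steps 1 and 3: the length is 1, so proceed normally; the helper
     * call is in tail position. */
    return helper(c, c_len);
}
```

A stray word that happens to equal 1 is indistinguishable from a correctly passed length, which is exactly the case the scheme relies on; correct callers never take the extra-frame path.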

comex · on June 22, 2019

I’m not sure that doing that check in every function that might be called from C would actually improve performance compared to just turning off tail calls.

kazinator · on June 22, 2019

Well, not every function that might be called from C; every function which takes this funny string argument that is known to be of length 1 that might be called from C.

uxp100 · on June 21, 2019

The article quoted someone saying "OUCH. So, basically, people have been depending on C undefined behavior for ages..."

This doesn't actually have to do with C undefined behavior at all, right? I think I get what is happening but that comment is throwing me off.

quietbritishjim · on June 21, 2019

As I understand it, some C programs omit the final argument to a function. That function does not actually read the value of the argument, but it is still undefined behaviour for it not to be passed in.

wahern · on June 22, 2019

Yes and no. This is a great example of why some of the rhetoric about undefined behavior in C is misleading.

This is fundamentally an FFI issue. The C compiler is relying on the function signature declared, explicitly or implicitly, by the person writing the binding code. But this declared signature is wrong, and because it's wrong C says that the behavior when invoking the routine is undefined--undefined behavior.

Other languages would have the exact same result. If you told Rust that the FFI prototype for a FORTRAN routine was foo(pointer-to-char) instead of foo(pointer-to-char, size_t), you'd get the exact same broken runtime behavior as with the C code. But Rust never defines this as "undefined behavior". Rust's behavior is no less undefined in the practical sense, it just doesn't formalize the concept in its specification, such as it is. Like most languages it sidesteps the issue because it's a thorny area for a specification, especially thorny now that the literal phrase "undefined behavior" elicits reactionary flames from the peanut gallery. If you say nothing you're not inviting criticism, nor are you burdened with managing yet another dimension of consistency in your specification.

There are problems with "undefined behavior" in C, how it works in the standard and especially how compilers use it to shift blame for insecure semantics. The problems are just nuanced and technical and can't be constructively discussed outside specific contexts. And especially in discussions comparing C to different languages, these problems get conflated with orthogonal issues like type safety.

EDIT: For context here are some relevant citations to C11 itself:

C11 (N1570) 3.4.3p1: "undefined behavior [is] behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements"

C11 (N1570) 3.4.3p2: "NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message)."

C11 (N1570) 4p2: "If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a constraint or runtime-constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words ‘‘undefined behavior’’ or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe ‘‘behavior that is undefined’’."

uxp100 · on June 22, 2019

I guess my confusion is if writing a signature for a FORTRAN function wrong is undefined behavior.

I wasn't thinking that having a function prototype incompatible with the external fortran code would be undefined, just, uh, wrong in a normal sort of way.

shakna · on June 22, 2019

How could it be wrong in a 'normal sort of way', though?

The type signature is the rule of law that allows you to normally pass safe pieces of memory between two systems. If you define it badly, what exactly is supposed to catch this and tell you that it's wrong?

Is the compiler supposed to speak both languages, and be able to disassemble whatever object code it's given when linking to attempt to determine if the signature is valid?

uxp100 · on June 22, 2019

By generating c code that correctly sets memory to call a function with a particular ABI with the given signature. Then it's the normal sort of wrong.

Since the spec doesn't speak about FFI at all it seems, I think this is implementation defined behavior, not undefined behavior.

shakna · on June 22, 2019

> By generating c code that correctly sets memory to call a function with a particular ABI with the given signature.

And it does. The signature is wrong, however, and so the resulting call will be wrong.

Since the standard doesn't speak about FFI, and the invalid call is actually in another language's memory, I'd say we're in completely undefined territory here.

To try and say that more clearly, I'd still call it undefined behaviour if an invalid type signature in Java calling Go resulted in something going wrong. Java can't know how Go is supposed to react to a fudge in its internal memory semantics.

uxp100 · on June 22, 2019

Ok, I understand what you mean. In the c spec undefined behavior has a specific meaning that I don’t think applies to this case.

perl4ever · on June 22, 2019

"that the literal phrase "undefined behavior" elicits reactionary flames from the peanut gallery"

The issue I have with "undefined behavior" is that defining undefined behavior is like dividing by zero. It's like saying you have proven the efficacy of the placebo effect. Once you start believing in contradictions, you are on the road to hell.

There is an immeasurable difference between what you say Rust does, leaving undefined behavior undefined, and trying to have it both ways.

wahern · on June 22, 2019

There's exactly 1 implementation of Rust. Rust is whatever HEAD in the Rust repository implements.

C was standardized nearly 20 years after the first C compiler. The original C standard and its revisions were written in a way that (with some exceptions and caveats) effectively christened existing implementations and codebases as "C", just not necessarily in their entirety. The guiding principle has always been to iteratively formalize and evolve existing practice. If a standard were to effectively tell any compiler or project that wasn't line-for-line "conformant" or "correct" or "strictly conformant" to GTFO, the C standard would have failed in bringing order to chaos. The use of "undefined behavior" was and remains a mechanism for achieving that goal. C isn't trying to "have it both ways"; it's addressing real, substantial dilemmas within a huge software ecosystem.

C's approach is somewhat unique and idiosyncratic, but rooted in a historical context that cannot be ignored. And its success is undeniable. Some languages, like Rust, with only a single well-accepted implementation have the luxury of staying silent. Other languages (e.g. Ada, once upon a time) with diverse, equally relevant implementations that use a heavier hand in directing the evolution have a poor track record. You can't force a vendor to implement some behavior if they don't want to; and if you attempt to coerce them they're liable to walk away. C++ is another somewhat unique case, but while there remains substantial diversity of C compilers that are as a practical matter functional C11 implementations (C11 made a lot of C99 optional), there are now effectively only two relevant C++ compilers: GCC and clang. (Maybe Visual Studio, we'll see.)

Step back and consider why C remains so popular and successful despite its obvious and well acknowledged faults. People think it's an accident of history that C has been so influential, but I think that's self-serving. It's much easier to say that than to acknowledge that the evolution of C reflects an uncomfortable reality in the technology industry regarding the forces that lead to success and failure. Node.js and PHP reflect some similar uncomfortable truths. C's concept of "undefined behavior" is an attempt to cabin and grapple with some of those realities in an honest and more intellectually rigorous manner. That's why it's so easy to pick on "undefined behavior" in C; its whole function is to help outline the contours of and acknowledge some very deep pathologies rather than ignoring them or pretending they don't exist. Rust's unsafe{} is loosely similar, but you can't directly compare the two because both the historical and contemporary contexts are different. Again, there's precisely 1 Rust implementation; it doesn't need to formalize undefined behavior in a way that permits reconciling diverse behaviors (behaviors that are often justifiably relied upon) and providing a path to better, safer, unified semantics. Rust developers can ignore various matters of undefined behavior until one day it can dictate the One True Way, without risk of splitting the community or making itself irrelevant.

pjmlp · on June 22, 2019

There are plenty of relevant C and C++ compilers.

The computing world isn't constrained to Linux and BSD clones running on the server room or mobile phones.

moefh · on June 22, 2019

> The use of "undefined behavior" was and remains a mechanism for achieving that goal.

Is that really true? Couldn't the same goal be achieved by using implementation-defined behavior instead?

I thought C keeps undefined behavior because it leaves the door open for optimizations. Otherwise (for example) int overflow could be left as implementation-defined behavior: almost everyone gets the expected behavior (two's complement wrap-around) and the 0.01% unlucky enough to be using a really old and weird processor gets some other behavior (one's complement wrap-around, traps or who knows what other kinds of processors have ever existed).

The reason why this is not done, in my understanding, is that int overflow (among other stuff) being left undefined means the compiler has more room for optimization.
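A minimal sketch of the kind of optimization being referred to. Because signed overflow is undefined, a compiler is allowed (though not required) to fold the signed comparison below to the constant 1. The unsigned version has defined wrap-around, so it must return 0 at the maximum value. The function names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Signed: for every x where x + 1 does not overflow, the result is 1.
 * Since overflow is undefined, an optimizer may assume it never happens
 * and treat the expression as always true. Calling this with INT_MAX
 * is undefined behavior. */
int signed_succ_is_greater(int x)
{
    return x + 1 > x;
}

/* Unsigned: arithmetic wraps modulo UINT_MAX + 1, so the compiler must
 * preserve the UINT_MAX case, where x + 1u is 0. */
int unsigned_succ_is_greater(unsigned x)
{
    return x + 1u > x;
}

/* Small self-check that uses only well-defined inputs. */
void overflow_demo(void)
{
    assert(signed_succ_is_greater(41) == 1);
    assert(unsigned_succ_is_greater(UINT_MAX) == 0);
}
```

If signed overflow were implementation-defined wrap-around instead, the signed function would need the same special case as the unsigned one.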

ncmncm · on June 23, 2019

"Implementation-defined" creates a requirement on implementors to document a definition on physical paper.

An implementer, though, may write down, "The behavior of this construct is undefined", and they have discharged that responsibility.

Undefined really does mean not defined -- but, not defined in the Standard. Anybody is free to define anything the Standard doesn't, and a program may rely on that. #include <unistd.h> is UB, but your compiler uses the definition provided by the (rather, a) separate Posix Standard.

Implementations do take advantage of explicit UB for optimization opportunities, but that is not its purpose.

ChickeNES · on June 22, 2019

> There's exactly 1 implementation of Rust. Rust is whatever HEAD in the Rust repository implements.

That's not true: https://github.com/thepowersgang/mrustc

ncmncm · on June 23, 2019

I asked about that. It doesn't compile Rust, but only the exact subset and dialect of Rust that was used to code exactly one program: the Rust 1.19 compiler.

In other words, it's not properly a compiler, as such, but a bootstrapping tool. It does important compiler-esque things, but not others, and is useful to no one else.

Someone · on June 22, 2019

One could argue that’s not a rust compiler because it doesn’t have rust’s most-defining feature: the borrow checker (”mrustc works by compiling assumed-valid rust code (i.e. without borrow checking)”)

kazinator · on June 22, 2019

ISO C specifies no requirements for interacting with Fortran. Every aspect of this is "C undefined behavior". Requirements are violated, to be sure, but not ones coming from ISO C.

uxp100 · on June 22, 2019

That doesn't sound like undefined behavior to me. Or at least a different definition than the normal one.

kazinator · on June 22, 2019

"C undefined behavior" is the status of the behavior of a C program for which a requirement cannot be found in ISO C by omission, or for which ISO C explicitly says that it's not giving a requirement. No requirement can be found in ISO C for linkage with Fortran.

There is another "undefined behavior" which is the informal situation that happens when a behavior is not defined by anyone at all: not ISO C, not the platform's ABI or API's, not the compiler reference manual. (That is hard to pin down, because what does it mean? If I compile a program, it's always defined by whatever the object code is doing. We can interpret the instructions through the architecture reference manual, and if we don't find a problem then, hey, it's defined! However, the compiler's documentation doesn't guarantee the consistency of always producing that particular compiler output for that program.)

In any case, that Bugzilla comment specifically said "C undefined behavior".

verisimilitudes · on June 21, 2019

I'm horribly amused by this article. You constantly see C programmers working around that mess of a language like this and so constantly see these glaring flaws appear as if from the ether, almost as if it's a natural consequence of programming.

Meanwhile, Ada has Interfaces.Fortran for actually interfacing with the language and doing it correctly. Fools will tell you C is the backbone of civilization and everything interfaces with it, but this is yet another example of how they're wrong.

anfilt · on June 22, 2019

This is not a language issue, but programmers who have made a mistake. The same thing would happen if an assembly programmer were not to set the correct registers or push all the arguments to the stack before calling another function. The same thing would happen in most languages if a programmer declared their FFI incorrectly.

So programmers have written function prototypes for the Fortran functions, but they have omitted the length parameter in that prototype. That is the reason behind this. It should never have worked to begin with, but the way it was handled on the Fortran side of things meant that for length-1 strings it did not crash. GCC probably could have just made the change they wanted, but they don't want to immediately break existing code that is technically already broken.

It boils down to some programmer/s doing the following.

int foo(char *str);

instead of

int foo(char *str, int len); // The correct prototype.

The prototype is you saying there is a function that exists with this name, return type and parameters. The compiler will then construct calls with what you have defined as the function prototype. Then when linking the linker just looks at the name of the called function and replaces it with the address/memory position of the function.
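As a hedged sketch of what a corrected binding might look like: the symbol name `foo_` (gfortran's usual lower-case-plus-underscore naming), the `fortran_charlen_t` typedef, and the body of the stand-in routine are assumptions for illustration. The stand-in is written in C so the example links on its own; in a real project that definition would be the compiled Fortran routine.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: type of the hidden length. gfortran documents size_t for
 * current releases and int for older ones; check the compiler in use. */
typedef size_t fortran_charlen_t;

/* What the C side must declare: the string by reference, followed by
 * the hidden length as the last argument. */
int foo_(const char *str, fortran_charlen_t str_len);

/* C stand-in for the Fortran routine, so the sketch is self-contained.
 * Returns 1 when handed exactly one character equal to 'N'. */
int foo_(const char *str, fortran_charlen_t str_len)
{
    assert(str != NULL);
    return str_len == 1 && str[0] == 'N';
}

/* Wrapper that keeps the hidden length in one place, so individual
 * call sites cannot forget it. */
int call_foo(char flag)
{
    return foo_(&flag, 1);
}
```

Routing every call through a wrapper like `call_foo` means a change in the length type or convention needs fixing in one place rather than at each call site.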

verisimilitudes · on June 22, 2019

>This is not a language issue, but programmers who have made a mistake.

That's always the excuse with C, I know.

>The same thing would happen in most languages if a programmer declared their FFI incorrectly.

If the Ada Interfaces.Fortran didn't work, that would be a compiler issue, but that's only because Ada does things right.

>It boils down to some programmer/s doing the following.

I read the article, you know.

anfilt · on June 22, 2019

You can do the same thing in ada. Instead of using the Fortran_Character type defined in Interfaces.Fortran you could just define your own that does not follow Fortran's string calling convention.

Is that a language problem with ada? That's all the C programmers did here.

verisimilitudes · on June 22, 2019

>You can do the same thing in ada.

I wouldn't know how GNAT behaves in this case, but I figure it would at the least be much harder to do by mistake, as has been done with C for years.

>Instead of using the Fortran_Character type defined in Interfaces.Fortran you could just define your own that does not follow Fortran's string calling convention.

The Ada compiler is aware of what's being done when you interface with another language. You must tell it how the procedure is from a different language and you're also able to specify the representation of Ada data types to conform. I wouldn't be surprised to learn that GNAT is capable of checking on the Fortran side of things as well, for interfacing like this.

>Is that a language problem with ada? That's all the C programmers did here.

I don't believe Ada has this issue, but I'd wager it at the least makes it harder to make. In any case, I have no personal experience interfacing Ada with Fortran.

My point, however, is that this is a very common issue with C programming, where people don't know what they're doing and things break constantly. The C language is not designed for interfacing with other languages, in part because it wasn't designed much at all.

rightbyte · on June 22, 2019

Calling conventions are ... conventions. What language magically solves inter architecture ABIs?

icedchai · on June 23, 2019

Given how many other languages interface with C (basically, all of them), this statement is a bit silly and clearly false.