Generate "Prelude" of standard C types #293
Directory
Do we want to always/only use that directory, or might we want to run the command using different include directories? Would it be worthwhile to either always inject that include directory or expose our own |
I get a fatal error that What should we do? |
Assuming that we want to update the vendored Musl headers as new versions are released, I would like to make doing so trivial by using a script. Would it be acceptable to add a Bash script to do this named I would like to confirm: our vendored Musl header files include the How were the current Musl headers vendored? Were they copied from a release tarball, for example? |
No, they include x86_64 build. I fail to understand how updating musl is related to generating prelude of standard C types. |
IMHO, we should support |
I am working on development command
Thank you! I did a build of Musl 1.2.5 targeting Verbose notes: Building Musl 1.2.5:
Comparing the built include directory with the source:
To install/upgrade our vendored Musl headers, it looks like we cannot just copy headers from the tarball. We should copy the To always target |
Regarding compatibility, I found the compatibility wiki page with links to C99 API coverage and C11 API coverage. In addition to |
I would like to confirm how we will implement our "prelude" of C types. There are a lot of types. In C, many types are available from multiple header files. In Musl, such types are defined in
Perhaps we should use the Many of the types are platform-specific. Should we generate Haskell definitions that match the current platform? If so, I wonder what is the best way to do this. If we also want to enable/disable features based on availability (when using GNU vs. Musl for example), perhaps we should use Autoconf? Each type should have a
newtype CInt8 = CInt8 CSChar
  deriving newtype (
      Bits
    , Bounded
    , Enum
    , Eq
    , FiniteBits
    , Integral
    , Ix
    , Num
    , Ord
    , Read
    , Real
    , Show
    , Storable
    )
|
These types have no instances. Comments indicate that What should we do about such types in our prelude? Perhaps we should define our own versions of these types and not use those in For example, we could implement (our own version of)
newtype CFile = CFile CChar
  deriving newtype (Eq, Storable)
Types
typedef union _G_fpos64_t {
  char __opaque[16];
  long long __lldata;
  double __align;
} fpos_t;
Perhaps in practice we will only ever deal with pointers to such types anyway. This is probably why they are defined like that in We will need to test/use the types to gain confidence in the design. |
We are translating C
typedef struct __jmp_buf_tag {...} jmp_buf[1];
typedef jmp_buf sigjmp_buf;
In such a case, perhaps both should be defined using
data CJmpBuf = CJmpBuf
data CSigJmpBuf = CSigJmpBuf
|
Musl supports multiple standards, including POSIX and BSD. Different standards provide different declarations in the same header files, so the declarations exposed by a header file depend on which standards are enabled. For example, standard header Here are the relevant preprocessor definitions:
When none are defined by the user, What standards do we want to support in our prelude? In the implementation of the |
I tried configuring
#define __STRICT_ANSI__ 1
#if defined(_ALL_SOURCE)
#undef _ALL_SOURCE
#endif
#if defined(_BSD_SOURCE)
#undef _BSD_SOURCE
#endif
#if defined(_DEFAULT_SOURCE)
#undef _DEFAULT_SOURCE
#endif
#if defined(_GNU_SOURCE)
#undef _GNU_SOURCE
#endif
#if defined(_POSIX_C_SOURCE)
#undef _POSIX_C_SOURCE
#endif
#if defined(_POSIX_SOURCE)
#undef _POSIX_SOURCE
#endif
#if defined(_XOPEN_SOURCE)
#undef _XOPEN_SOURCE
#endif
As a test, I added Perhaps we can do this configuration via Clang arguments, but I am concerned that this is not working regardless. |
Clang option
I noticed that the option is stored in |
The I will definitely change this behavior, but for now I am simply removing the Inserting Alternatively, passing the My sanity is restored. Considering the options, my proposal is to always specify a C standard when running the |
Talking with @edsko yesterday, it indeed sounds like we will only support C standards in our standard library (currently "patterns"). Based on a previous discussion, all C code that we generate should conform to C17. I am now thinking about how to add a C standard option to the command-line executable. Should it be added to the top-level, alongside In the context of development commands such as What should the behavior be for the Should the option support older standards such as C11? EDIT: I went ahead and implemented it at the top level for now, as we already parse |
Oops, I apologize for causing harm to your sanity 😬 |
@sheaf mentioned that This is a good reminder, however, that one may get different results depending on the headers used. I was unable to find an elegant way to query the default Clang header search path, but it can be found in the output of the following command.
The GCC defaults can be found in the output of the following command.
On my system using the default Clang installation, the headers are in On my system using the default GCC installation, the headers are in When executing
It runs fine when specifying options as follows.
Inspecting the generated code, it is
newtype Ptrdiff_t = Ptrdiff_t
  { unPtrdiff_t :: FC.CLong
  }
I am unable to investigate using For the
|
How should we handle this type? Note that it is used in the standard library. Example: typedef struct { long long __ll; long double __ld; } max_align_t; |
What naming conventions should we use for types defined in the prelude? The module re-exports types defined in Note that Alternatively, we can follow our own default naming conventions, though this results in non-Haskell-style names that are not consistent with the names of types exported from |
As decided on in our last call, I am implementing the prelude in module Architecture-dependent implementations are re-exported from internal module There will be a separate
I am currently exposing the
-- | Design patterns for writing high-level FFI bindings
--
-- This is the only exported module in this library. It is intended to be
-- imported unqualified.
Should we continue to do this even as the library grows? By the way, I see that FWIW, as a user, I would much prefer a separate One reason for exposing the |
Here is a GHC issue regarding The My understanding, however, is that the implementation may differ between libraries/compilers, not just per architecture, since the standard does not specify the type exactly. Perhaps implementations are consistent in practice... The
Source
#include <stdalign.h>
#include <stdio.h>

int main(void) {
    printf("sizeof(float): % 2zu\n", sizeof(float));
    printf("alignof(float): % 2zu\n", alignof(float));
    printf("sizeof(double): % 2zu\n", sizeof(double));
    printf("alignof(double): % 2zu\n", alignof(double));
    printf("sizeof(long double): % 2zu\n", sizeof(long double));
    printf("alignof(long double): % 2zu\n", alignof(long double));
    return 0;
}
As mentioned in the GHC issue, an alternative approach would be to define an opaque type and only support pointers. That would limit what we can support, though, which may be problematic. |
Initial PR: #347 |
In C89 (4.4 Localization), the following fields are defined:
char *decimal_point;      /* "." */
char *thousands_sep;      /* "" */
char *grouping;           /* "" */
char *int_curr_symbol;    /* "" */
char *currency_symbol;    /* "" */
char *mon_decimal_point;  /* "" */
char *mon_thousands_sep;  /* "" */
char *mon_grouping;       /* "" */
char *positive_sign;      /* "" */
char *negative_sign;      /* "" */
char int_frac_digits;     /* CHAR_MAX */
char frac_digits;         /* CHAR_MAX */
char p_cs_precedes;       /* CHAR_MAX */
char p_sep_by_space;      /* CHAR_MAX */
char n_cs_precedes;       /* CHAR_MAX */
char n_sep_by_space;      /* CHAR_MAX */
char p_sign_posn;         /* CHAR_MAX */
char n_sign_posn;         /* CHAR_MAX */
In C99 (7.11 Localization), the following fields were added:
char int_p_cs_precedes;   // CHAR_MAX
char int_n_cs_precedes;   // CHAR_MAX
char int_p_sep_by_space;  // CHAR_MAX
char int_n_sep_by_space;  // CHAR_MAX
char int_p_sign_posn;     // CHAR_MAX
char int_n_sign_posn;     // CHAR_MAX
Musl defines the structure with all of these fields, compliant with C99 and later. What should we do in this case? Perhaps one option is to implement the versions as separate types and make the C standard a factor when resolving types. |
In general, I wonder how we will resolve Perhaps a first step is mapping C names to We cannot simply compare the types. For example, we define We can generate code for types that differ. For example, a C89 version of |
Brief summary discussion with @TravisCardwell on naming conventions:
|
Discussing in chat, there are changes to what we will put in the We decided to not re-export types from We decided to not create |
I am working on using From C99 section 7.23.1 paragraph 4:
The first issue is that different implementations may add fields as long as they keep the standard ones. We cannot implement a corresponding Haskell type in If we were to only support specific C library implementations, we could provide a Haskell type if the types across supported implementations are compatible. Compatibility would need to be across architectures and versions, so I think we would need extensive tests. In this case, Musl adds two extra fields related to time zones, each prefixed with two underscores. Glibc adds the same extra fields, optionally prefixed with two underscores according to a preprocessor flag, but the possible difference in field names does not matter to us. The second issue is that the order is not specified. The order of the fields determines how they are laid out in memory, determining the offsets that we need in Even if we only support specific library implementations for which the fields and field order are fixed, the type size, alignment, and field offsets are all system-dependent. Since we have As a concrete example, here is the
                           // Offset  Size  Alignment  Padding
struct tm {                //           56          8
    int tm_sec;            //      0     4          4
    int tm_min;            //      4     4          4
    int tm_hour;           //      8     4          4
    int tm_mday;           //     12     4          4
    int tm_mon;            //     16     4          4
    int tm_year;           //     20     4          4
    int tm_wday;           //     24     4          4
    int tm_yday;           //     28     4          4
    int tm_isdst;          //     32     4          4
                           //                                4
    long __tm_gmtoff;      //     40     8          8
    const char *__tm_zone; //     48     8          8
};
I tentatively implemented a Template Haskell function
layoutRecord ::
     [(Size, Alignment)]
     -- ^ Size and alignment of each field
  -> (Size, Alignment, [Offset])
     -- ^ Type size, alignment, and field offsets
layoutRecord = aux 0 0 []
  where
    aux ::
         Offset
      -> Alignment
      -> [Offset]
      -> [(Size, Alignment)]
      -> (Size, Alignment, [Offset])
    aux offset alignment acc = \case
      (fieldSize, fieldAlignment) : ps ->
        let fieldOffset = case offset `mod` fieldAlignment of
              0 -> offset
              w -> offset - w + fieldAlignment
        in  aux
              (fieldOffset + fieldSize)
              (max alignment fieldAlignment)
              (fieldOffset : acc)
              ps
      [] -> (offset, alignment, reverse acc)
Is this correct/safe across different architectures? Would it be preferable to just use C ( |
In the repository there is a file standard_headers.h which imports all of the C standard library. We also have a skeleton implementation of some code (bootstrapPrelude) that trawls through these definitions; currently this is only being used to check if the macro parser is failing to parse some macros that we should be parsing (just to have a source of examples). We should extend this to make a list of all standard types in the standard library, and ensure that for all of these standard C types we have a well-defined translation to standard Haskell types (either from base or from hs-bindgen-patterns).