• Examples of string functions in C. Input and output of character strings in C

    Declaring Strings

    A string in C is a one-dimensional array of characters whose last element is the null character '\0', which marks the end of the string (hence the name null-terminated string).

    A string variable can be declared in three ways, two of which also initialize the string at declaration time.

    First way:

    Declare a character array (remember to leave room for the terminating null):

    char s[81]; /* up to 80 characters plus '\0'; the size is an illustrative choice */

    Second way:

    Assign an initial value to the string variable (in this case the compiler works out the length of the string itself):

    char s[] = "Example of string initialization";

    To the right of the assignment sign is a string literal. A null character ('\0') is automatically appended to the end of the string. String literals have static storage duration.

    Third way:

    Declare a pointer rather than an array. To the left of the assignment sign is a pointer to a character:

    char *s = "Second initialization option";

    The variable s is a pointer to the location in memory where the string literal is stored. The literal itself must not be modified through this pointer. A potential pitfall of this notation is that a pointer to a character is often loosely called a string. The declaration below is only a pointer to a character, since no space is reserved for the string itself:

    char *s;
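The difference between the array form and the pointer form can be checked directly. The sketch below is only an illustration; the function names are hypothetical and not part of the original text.

```c
#include <stddef.h>
#include <string.h>

/* Array form: the array owns a modifiable copy of the text, '\0' included. */
size_t array_form_size(void)
{
    char s[] = "abc";
    return sizeof s;          /* 3 characters + 1 terminating null */
}

/* Pointer form: s only stores the address of a read-only literal. */
size_t pointer_form_length(void)
{
    const char *s = "abc";
    return strlen(s);         /* counts characters up to, not including, '\0' */
}
```

Called from main, array_form_size() yields 4 and pointer_form_length() yields 3: sizeof counts the terminator, strlen does not.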

    Entering a string from a standard input device (keyboard)

    There is a set of functions for working with strings. For input from the standard input device (keyboard), the library functions scanf and gets from the standard input/output header stdio.h are most often used. Note that gets cannot limit how many characters it stores, so it was removed from the language in C11; fgets is the standard replacement.

    To read a string with scanf, use the %s format. Note that the address-of operator & is not placed before the string identifier, since the name of a one-dimensional array already converts to a pointer to its first element:

    scanf("%s", s);

    The gets() function reads characters until it reaches a newline character. It accepts all characters up to the newline but does not store the newline itself. A terminating null ('\0') is appended to the end of the string. gets() places the characters read from the keyboard into its string argument and returns a pointer to that string if the operation succeeded, or NULL on error. In the example below, if the operation succeeds, two identical lines are displayed on the screen:

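    #include <stdio.h>

    int main(void)
    {
        char s[81];   /* buffer size is an illustrative choice */
        char *p;

        p = gets(s);  /* removed in C11; shown only because the text describes it */
        printf(" \n The string %s was entered.", s);
        if (p) printf(" \n The string %s was entered.", p);
        return 0;
    }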

    In passing, note that gets is often used to read any keyboard input as a string, which is then converted to the required format by sscanf or analysed first, for example:

    #include <stdio.h>

    int main(void)
    {
        char s[81];   /* buffer size is an illustrative choice */
        int x, err;

        do {
            printf(" \n Enter an integer -> ");
            gets(s);  /* removed in C11; fgets is the standard replacement */
            err = sscanf(s, "%d", &x);
            if (err != 1) printf(" \n Input error. ");
        } while (err != 1);
        printf("\n Integer entered -> %d", x);
        return 0;
    }
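The same read-then-parse pattern can be written with fgets, which takes the buffer size and so cannot overflow it. This is a minimal sketch; read_line and parse_int are hypothetical helper names introduced only for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from `in` into buf (at most size - 1 characters) and
   strips the trailing newline. Returns 1 on success, 0 on EOF or error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* cut at the newline, if one was stored */
    return 1;
}

/* Converts a line to an int with sscanf. Returns 1 if a number was read. */
int parse_int(const char *line, int *out)
{
    return sscanf(line, "%d", out) == 1;
}
```

A loop such as `while (read_line(stdin, s, sizeof s) && !parse_int(s, &x))` can then repeat the prompt until a valid integer is entered.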

    Printing strings to standard output (monitor screen)

    To print strings to the standard output device (the screen), two functions can be used: printf and puts. printf is given "%s" as the format. The convenience of this function is that, besides a string, it can print data of other types in the same call. The distinguishing feature of puts is that after the string is printed, output automatically moves to the next line.

    Functions for working with strings

    To process strings, C provides the string library (the string.h header). Each function has its own prototype.

    The most commonly used functions are discussed in this article.

    Example of programs (listing) working with strings

    It is no coincidence that I placed the topic of strings in the "Arrays" section, since a string is essentially an array of characters. Here's an example:

    char str[] = "This is just a string";

    To make this clearer, the same string can be written like this:

    char str[] = {'T','h','i','s',' ','i','s',' ','j','u','s','t',' ','a',' ','s','t','r','i','n','g','\0'};

    That is, it is still an array, only one made up of characters. You can therefore work with it just as you would with an integer array.

    Now let's try working with strings in C. In the introductory lessons we learned that characters belong to the integer types, i.e. each character has its own numeric value. Here is an example problem and its solution:

    You need to convert the entered word to uppercase:

    #include <stdio.h>

    int main(void)
    {
        char str[] = "sergey";

        /* In ASCII, a lowercase letter's code is 32 greater than its uppercase counterpart */
        for (int i = 0; str[i] != '\0'; i++) {
            str[i] -= 32;
        }
        for (int i = 0; str[i] != '\0'; i++) {
            printf("%c", str[i]);
        }
        getchar(); /* wait for Enter; the original used the non-standard getch() */

        return 0;
    }

    To see a character's numeric code, simply use the %d specifier in printf. One more important point: every string ends with a null terminator, written as the special character '\0'.
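Subtracting 32 only works for lowercase ASCII letters; digits, spaces and letters that are already uppercase would be corrupted. A more robust variant, sketched here under the assumption that the standard toupper function from ctype.h is acceptable (the helper name is hypothetical), leaves all other characters untouched:

```c
#include <ctype.h>

/* Converts s to upper case in place and returns s.
   Characters that are not lowercase letters are left unchanged. */
char *to_upper_in_place(char *s)
{
    for (char *p = s; *p != '\0'; p++)
        *p = (char)toupper((unsigned char)*p);  /* cast avoids negative char values */
    return s;
}
```

For the array `char w[] = "sergey 1";` the call to_upper_in_place(w) produces "SERGEY 1".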

    Another way to specify a string is to declare it using char*. Here's an example:

    char *str = "wire";

    That is, a pointer is created that points to a string stored somewhere in memory.

    And here is how to read a string with the scanf function, which is already familiar to us:

    char str[16]; scanf("%s", str);

    There are two subtleties here:

    1. The address-of operator is not needed here, since the name of the array, as we already know, is an address.
    2. The length of the input string must not exceed 15 characters, since the last element must hold the null terminator. scanf itself writes this character after the last character you entered.

    C already has standard library functions for working with strings and with characters. To work with characters, include the ctype.h header. It contains functions for testing and changing a character's case and for determining its class (letter, digit, whitespace and so on). Basically, everything you might need to know about a character can be found using the functions in ctype.h.

    Sometimes you may need to convert a string to another data type. The stdlib.h header provides functions for this:

    1. int atoi (const char *str)
    2. long atol (const char *str)
    3. double atof (const char *str)

    These functions can be very helpful, for example when you need to extract a year or another numeric value from a string. Working with strings in C is a very important topic, so try to understand this lesson.

    Strings. String input/output. Formatted I/O. String processing using standard C language functions. Working with memory.

    1.1. Declaration and initialization of strings.

    A string is an array of characters that ends with the null character '\0'. A string is declared as an ordinary character array, for example,

    char s1[10]; // string nine characters long

    char *s2; // pointer to string

    The difference between s1 and s2 is that the array name s1 always refers to the same memory, like a named constant, whereas the pointer s2 is a variable that can be made to point elsewhere.

    String constants are enclosed in double quotes, unlike characters, which are enclosed in single quotes. For example,

    "This is a string."

    The C89 standard only guarantees support for string literals of up to 509 characters (C99 raised this minimum to 4095). However, many implementations allow longer strings.

    When initializing a string, it is better not to specify the array size; the compiler will do this by calculating the length of the string and adding one to it. For example,

    char s1[] = "This is a string.";

    C provides a large number of functions for working with strings, whose prototypes are declared in the stdlib.h and string.h header files. These functions are discussed in the following paragraphs.

    1.2. String input/output.

    To enter a string from the console, use the function

    char* gets(char *str);

    which writes a string to the address str and returns the address of the entered string. The function stops input if it encounters a ‘\n’ or EOF (end of file) character. The newline character is not copied. A zero byte is placed at the end of the read line. If successful, the function returns a pointer to the line read, and if unsuccessful, NULL.

    To output a string to the console, use the standard function

    int puts (const char *s);

    which, if successful, returns a non-negative number, and if unsuccessful, returns EOF.

    The gets and puts function prototypes are described in the stdio.h header file.

    #include <stdio.h>

    printf("Input String: ");

    1.3. Formatted I/O.

    For formatted data input from the console, use the function

    int scanf (const char *format, ...);

    which, if successful, returns the number of input items successfully read and assigned, and returns EOF if input fails before the first conversion. The format parameter must point to a format string containing the input format specifications. The number and types of the arguments that follow the format string must match the number and types of the input formats specified in it. If this condition is not met, the behaviour of the function is undefined.

    A space, '\t' or '\n' character in the format string matches zero or more whitespace characters in the input stream; the whitespace characters are space, '\t', '\n', '\v' and '\f'. The scanf function skips whitespace characters in the input stream.

    Literal characters in the format string, with the exception of the % character, require that exactly the same characters appear in the input stream. If there is no such character, scanf stops reading. The scanf function reads and discards matching literal characters.

    In general, the input format specification looks like this:

    %[*] [width] [modifiers] type

    - '*' means that the field described by this specification is read but not stored;

    - 'width' defines the maximum number of characters read for this specification;

    The type can take the following values:

    c – a single character (or a character array when a width is given),

    s – a string of characters; strings are delimited by whitespace,

    d – signed decimal integer,

    i – signed integer; the base is determined by the prefix (0 for octal, 0x for hexadecimal, otherwise decimal),

    u – unsigned decimal integer,

    o – unsigned octal integer,

    x, X – unsigned hexadecimal integer,

    e, E, f, g, G – floating-point number,

    p – a pointer value (the argument is a pointer to a pointer),

    n – pointer to an integer that receives the number of characters read so far,

    […] – a set of characters to scan for (a scanset).

    In the last case, only the characters listed in the square brackets are read from the input stream. If the first character inside the square brackets is '^', then only characters that are not in the set are read. A range of characters in the set is specified with the '-' symbol. When characters are read this way, leading whitespace is not skipped, and a terminating null byte is appended to the stored string.
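A scanset is easiest to see with sscanf, which applies the same format rules to a string. The two helpers below are hypothetical names used only for this sketch; the width 15 keeps each result inside a 16-byte buffer.

```c
#include <stdio.h>

/* Copies the leading run of decimal digits of src into dst (at most 15).
   Returns 1 if at least one digit was found, 0 otherwise. */
int leading_digits(const char *src, char dst[16])
{
    return sscanf(src, "%15[0-9]", dst) == 1;
}

/* Copies everything before the first comma into dst (at most 15 characters).
   The '^' inverts the set: read any character except ','. */
int field_before_comma(const char *src, char dst[16])
{
    return sscanf(src, "%15[^,]", dst) == 1;
}
```

For the input "2024-05" the first helper stores "2024"; for "ab,cd" the second stores "ab".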

    Modifiers can take the following values:

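    h – short integer,

    l, L – long integer or floating-point number (l for long and double, L for long double),

    and are used only for integer or floating-point numbers.

    The following example shows several uses of scanf. Note that each format string, starting with the one that reads the floating-point number, begins with a space character; the space skips the newline left in the input stream by the previous entry.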

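    #include <stdio.h>

    int main(void)
    {
        int n;
        double d = 0.0;
        char c;
        char s[81]; /* buffer size is an illustrative choice */

        printf("Input an integer: ");
        scanf("%d", &n);

        printf("Input a double: ");
        scanf(" %lf", &d);

        printf("Input a char: ");
        scanf(" %c", &c);

        printf("Input a string: ");
        scanf(" %80s", s); /* s is already an address, so no & is needed */

        return 0;
    }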

    Note that in this program the floating-point variable is initialized. With some older compilers this was needed so that the library supporting floating-point numbers would be linked in; without it, an error could occur at run time when reading a floating-point number.

    For formatted output of data to the console, use the function

    int printf (const char *format, ...);

    which, if successful, returns the number of characters output, and if unsuccessful, returns a negative value. The format parameter is a format string that contains the output format specifications. The number and types of the arguments that follow the format string must match the number and types of the output format specifications in it. In general, the output format specification looks like this:

    %[flags] [width] [.precision] [modifiers] type

    - ‘flags’ are various symbols that specify the output format;

    - ‘width’ defines the minimum number of characters output according to this specification;

    - '.precision' defines the maximum number of characters printed for a string, or the number of digits after the decimal point for a floating-point number;

    - ‘modifiers’ specify the type of arguments;

    - 'type' specifies the type of the argument.

    To output signed integers, the following output format is used:

    %[-] [+ | space] [width] [l] d

    '-' – left alignment (the default is right alignment);

    '+' – a '+' sign is printed for positive numbers; note that for negative numbers the '-' sign is always printed;

    'space' – a space is printed in the sign position of non-negative numbers;

    d – int data type.

    To output unsigned integers, the following output format is used:

    %[-] [#] [width] [l] [u | o | x | X]

    # – a leading 0 is printed for octal numbers, or a leading 0x or 0X for hexadecimal numbers,

    l – long data type modifier;

    u – decimal integer,

    o – octal integer,

    x, X – hexadecimal integer.

    To output floating point numbers, the following output format is used:

    %[-] [+ | space] [width] [.precision] [f | e | E | g | G]

    "precision" – the number of digits after the decimal point for the f, e and E formats, or the number of significant digits for the g and G formats. Numbers are rounded. The default precision is six digits;

    f – fixed-point notation,

    e – exponential notation, with the exponent marked by the letter "e",

    E – exponential notation, with the exponent marked by the letter "E",

    g – the shorter of the f and e formats,

    G – the shorter of the f and E formats.

    printf("n = %d\n f = %f\n e = %e\n E = %E\n f = %.2f", -123, 12.34, 12.34, 12.34, 12.34);

    // prints, one item per line: n = -123 f = 12.340000 e = 1.234000e+01 E = 1.234000E+01 f = 12.34 (some older compilers print a three-digit exponent, e.g. e+001)
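The effect of the flags and the width is easier to inspect when the result is written to a buffer instead of the screen. The sketch below uses the standard snprintf for that; the two wrapper functions are hypothetical names chosen for illustration.

```c
#include <stdio.h>

/* Formats value with an explicit sign, a minimum width of 8 and 2 decimals.
   Returns the number of characters the full result needs. */
int format_signed(char *out, size_t size, double value)
{
    return snprintf(out, size, "%+8.2f", value);
}

/* Formats value left-aligned in a field of 6 characters, followed by '|'. */
int format_left(char *out, size_t size, int value)
{
    return snprintf(out, size, "%-6d|", value);
}
```

With a 32-byte buffer, format_signed(buf, sizeof buf, 12.34) stores "  +12.34" (padded on the left to 8 characters), and format_left(buf, sizeof buf, -123) stores "-123  |" (padded on the right to 6 characters).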

    1.4. Formatting strings.

    There are variants of the scanf and printf functions that are designed to format strings and are called sscanf and sprintf, respectively.

    int sscanf (const char *str, const char *format, ...);

    reads data from the string str according to the format string format. If successful, it returns the number of items read and assigned, and if unsuccessful, returns EOF. For example,

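    #include <stdio.h>

    char str[] = "a 10 1.2 String No input";

    char c; int n; double d; char s[81]; // receive the parsed values; the buffer size is an illustrative choice

    sscanf(str, "%c %d %lf %s", &c, &n, &d, s);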

    printf("%c\n", c); // prints: a

    printf("%d\n", n); // prints: 10

    printf("%f\n", d); // prints: 1.200000

    printf("%s\n", s); // prints: String

    int sprintf (char *buffer, const char *format, ...);

    formats the string in accordance with the format specified by the format parameter and writes the resulting result to the buffer character array. The function returns the number of characters written to the character array buffer, excluding the terminating null byte. For example,

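    #include <stdio.h>

    char str[] = "c = %c, n = %d, d = %f, s = %s";

    char s[] = "This is a string.";

    char c = 'c'; int n = 10; double d = 1.2; // values taken from the expected output below

    char buffer[80]; // must be large enough for the formatted result; the size is an illustrative choice

    sprintf(buffer, str, c, n, d, s);

    printf("%s\n", buffer); // prints: c = c, n = 10, d = 1.200000, s = This is a string.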

    1.5. Convert strings to numeric data.

    Prototypes of functions for converting strings to numeric data are given in the stdlib.h header file, which must be included in the program.

    To convert a string to an integer, use the function

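    int atoi (const char *str);

    which, if successful, returns the integer to which the string str is converted, and if unsuccessful, returns 0. For example,

    char *str = "-123";

    int n = atoi(str); // n = -123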

    To convert a string to a long integer, use the function

    long int atol (const char *str);

    which, if successful, returns the integer to which the string str is converted, and if unsuccessful, returns 0. For example,

    char *str = "-123";

    long n = atol(str); // n = -123

    To convert a string to a double number, use the function

    double atof(const char *str);

    which, in case of success, returns a floating number of type double, into which the string str is converted, and in case of failure, 0. For example,

    char *str = "-123.321";

    double n = atof(str); // n = -123.321

    The following functions perform similar functions to atoi, atol, atof, but provide more advanced functionality.

    long int strtol (const char *str, char **endptr, int base);

    converts the string str to a long int number, which it returns. The parameters of this function have the following purposes.

    If base is 0, then the conversion depends on the first two characters of str:

    If the first character is a digit from 1 to 9, the number is assumed to be decimal;

    If the first character is the digit 0 and the second character is a digit from 0 to 7, the number is assumed to be octal;

    If the first character is 0 and the second is 'X' or 'x', the number is assumed to be hexadecimal.

    If base is a number between 2 and 36, that value is taken as the base of the number system, and any character that is not a valid digit in that base stops the conversion. In bases 11 to 36, the letters 'A' to 'Z' or 'a' to 'z' are used to represent digits.

    The value of the endptr argument is set by the strtol function: it receives a pointer to the character that stopped the conversion of str. The strtol function returns the converted number if successful, and 0 if no conversion could be performed. For example,

    long n; char *p;

    n = strtol("12a", &p, 0);

    printf("n = %ld, stop = %c\n", n, *p); // n = 12, stop = a

    n = strtol("012b", &p, 0);

    printf("n = %ld, stop = %c\n", n, *p); // n = 10, stop = b

    n = strtol("0x12z", &p, 0);

    printf("n = %ld, stop = %c\n", n, *p); // n = 18, stop = z

    n = strtol("01117", &p, 2); // base 2: the digit 7 is not valid and stops the conversion

    printf("n = %ld, stop = %c\n", n, *p); // n = 7, stop = 7

    unsigned long int strtoul (const char *str, char **endptr, int base);

    works like the strtol function, but converts the text representation of a number to the unsigned long int type.

    double strtod (const char *str, char **endptr);

    Converts the symbolic representation of a number to a double.

    All functions listed in this paragraph stop working when they encounter the first character that does not fit the format of the number in question.

    In addition, if the value represented by the string is outside the range of the corresponding data type, the functions strtol, strtoul and strtod set the errno variable to ERANGE. The errno variable and the ERANGE constant are defined in the errno.h header file. In this case strtod returns the HUGE_VAL value (with the appropriate sign), strtol returns LONG_MAX or LONG_MIN, and strtoul returns ULONG_MAX. The atoi, atol and atof functions do not report range errors; for them the result of an out-of-range value is undefined.
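Putting endptr and errno together gives a conversion that can distinguish "no digits", "trailing junk" and "out of range". The following is a sketch of one common way to do this; parse_long is a hypothetical helper name, not a library function.

```c
#include <errno.h>
#include <stdlib.h>

/* Converts text to a long in base 10.
   Returns 1 on success and stores the value in *out; returns 0 if text
   contains no digits, has trailing characters, or is out of range. */
int parse_long(const char *text, long *out)
{
    char *end;

    errno = 0;                        /* strtol only sets errno, never clears it */
    long value = strtol(text, &end, 10);

    if (end == text)                  /* nothing was converted */
        return 0;
    if (*end != '\0')                 /* conversion stopped before the end */
        return 0;
    if (errno == ERANGE)              /* value does not fit in a long */
        return 0;

    *out = value;
    return 1;
}
```

parse_long("-123", &v) succeeds with v equal to -123, while "12a", the empty string and a 23-digit number are all rejected.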

    The non-standard functions itoa, ltoa, utoa, ecvt, fcvt, and gcvt can be used to convert numeric data to character strings. But it is better to use the standard sprintf function for these purposes.

    1.6. Standard functions for working with strings.

    This section discusses functions for working with strings, the prototypes of which are described in the header file string.h.

    1. String comparison. The functions strcmp and strncmp are used to compare strings.

    int strcmp (const char *str1, const char *str2);

    lexicographically compares the strings str1 and str2 and returns a negative value, 0 or a positive value if str1 is respectively less than, equal to or greater than str2. Only the sign of the result is specified by the standard.

    int strncmp (const char *str1, const char *str2, size_t n);

    lexicographically compares at most the first n characters of the strings str1 and str2. The function returns a negative value, 0 or a positive value if the first n characters of str1 are respectively less than, equal to, or greater than the first n characters of str2.

    // example of string comparison

    #include <stdio.h>

    #include <string.h>

    char str1[] = "aa bb";

    char str2[] = "aa aa";

    char str3[] = "aa bb cc";

    printf("%d\n", strcmp(str1, str3)); // prints a negative value

    printf("%d\n", strcmp(str1, str1)); // prints: 0

    printf("%d\n", strcmp(str1, str2)); // prints a positive value

    printf("%d\n", strncmp(str1, str3, 5)); // prints: 0
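Because only the sign of the strcmp result is specified, code that needs exactly -1, 0 or 1 has to normalize it. A small sketch (compare_sign is a hypothetical helper name):

```c
#include <string.h>

/* Returns -1, 0 or 1 depending on whether a is less than,
   equal to or greater than b in lexicographic order. */
int compare_sign(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);   /* maps any positive value to 1, any negative to -1 */
}
```

With the strings from the example, compare_sign("aa bb", "aa bb cc") is -1 and compare_sign("aa bb", "aa aa") is 1 on every conforming implementation.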

    2. Copying strings. The strcpy and strncpy functions are used to copy strings.

    char *strcpy (char *str1, const char *str2);

    copies the string str2 to the string str1. The entire string str2 is copied, including the terminating null byte. The function returns a pointer to str1. If the strings overlap, the result is undefined.

    char *strncpy (char *str1, const char *str2, size_t n);

    copies at most n characters from string str2 to string str1. If str2 contains fewer than n characters, null bytes are appended to str1 until n characters have been written. If str2 contains n or more characters, no terminating null byte is written to str1. The function returns a pointer to the string str1.

    char str1[20]; // must be large enough to hold str2; the size is an illustrative choice

    char str2[] = "Copy string.";

    strcpy(str1, str2);

    printf("%s", str1); // prints: Copy string.
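Since strncpy does not add a terminating null when the source is too long, a truncating copy has to add it explicitly. One possible sketch, with copy_truncated as a hypothetical helper name:

```c
#include <string.h>

/* Copies src into dst, truncating if necessary, and always leaves dst
   null-terminated. dst_size is the full size of the dst array. */
char *copy_truncated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return dst;                      /* no room even for '\0' */
    strncpy(dst, src, dst_size - 1);     /* may leave dst unterminated... */
    dst[dst_size - 1] = '\0';            /* ...so terminate it by hand */
    return dst;
}
```

Copying "Copy string." into a 4-byte array this way stores "Cop" followed by the null byte.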

    4. Concatenating strings. The functions strcat and strncat are used to join strings into one string.

    char* strcat (char *str1, const char *str2);

    appends string str2 to string str1, with the trailing zero byte of string str1 erased. The function returns a pointer to the string str1.

    char* strncat (char *str1, const char *str2, size_t n);

    appends at most n characters from string str2 to string str1, overwriting the terminating null byte of str1. If the length of str2 is less than n, only the characters contained in str2 are appended. After the concatenation a null byte is always added to str1. The function returns a pointer to the string str1.

    #include <stdio.h>

    #include <string.h>

    char str1[32] = "String "; // leaves room for the appended text; the size is an illustrative choice

    char str2[] = "catenation ";

    char str3[] = "Yes No";

    strcat(str1, str2);

    printf("%s\n", str1); // prints: String catenation

    strncat(str1, str3, 3);

    printf("%s\n", str1); // prints: String catenation Yes

    5. Search for a character in a string. To search for a character in a string, use the functions strchr, strrchr, strspn, strcspn and strpbrk.

    char* strchr (const char *str, int c);

    searches for the first occurrence of the character specified by c in the string str. If successful, the function returns a pointer to the first character found, and if unsuccessful, NULL.

    char* strrchr (const char *str, int c);

    searches for the last occurrence of the character specified by c in the string str. If successful, the function returns a pointer to the last character found, and if unsuccessful, NULL.

    #include <stdio.h>

    #include <string.h>

    char str[] = "Char search";

    printf("%s\n", strchr(str, 'r')); // prints: r search

    printf("%s\n", strrchr(str, 'r')); // prints: rch

    size_t strspn (const char *str1, const char *str2);

    returns the index of the first character from str1 that is not in str2.

    size_t strcspn (const char *str1, const char *str2);

    returns the index of the first character from str1 that appears in str2.

    char str[] = "123 abc";

    printf("n = %zu\n", strspn(str, "321")); // prints: n = 3

    printf("n = %zu\n", strcspn(str, "cba")); // prints: n = 4

    char* strpbrk (const char *str1, const char *str2);

    finds the first character in the string str1 that is equal to one of the characters in the string str2. If successful, the function returns a pointer to this character, and if unsuccessful, NULL.

    char str[] = "123 abc";

    printf("%s\n", strpbrk(str, "bca")); // prints: abc

    6. Substring search. The strstr function is used to find one string inside another.

    char* strstr (const char *str1, const char *str2);

    finds the first occurrence of str2 (without the terminating null byte) in str1. If successful, the function returns a pointer to the found substring, and if unsuccessful, NULL. If str2 points to a zero-length string, the function returns str1.

    char str[] = "123 abc 456";

    printf("%s\n", strstr(str, "abc")); // prints: abc 456

    7. Parsing a string into tokens. The strtok function is used to parse a string into tokens.

    char* strtok (char *str1, const char *str2);

    returns a pointer to the next token (word) in the string str1, in which the token delimiters are characters from the string str2. If there are no more tokens, the function returns NULL. On the first call to the strtok function, the str1 parameter must point to a string that is tokenized, and on subsequent calls this parameter must be set to NULL. After finding a token, the strtok function writes a null byte after this token in place of the delimiter.

    #include <stdio.h>

    #include <string.h>

    char str[] = "12 34 ab cd";

    char *p;

    p = strtok(str, " ");

    while (p != NULL) {

        printf("%s\n", p); // prints the values in a column: 12 34 ab cd

        p = strtok(NULL, " ");

    }
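The first-call/next-call protocol of strtok fits naturally into a for loop. The sketch below counts the tokens in a string; count_tokens is a hypothetical helper name. Note that strtok overwrites delimiters with null bytes, so the argument must be a modifiable array, not a string literal.

```c
#include <string.h>

/* Counts the tokens in text, where any character of delims separates tokens.
   text is modified: strtok writes '\0' over the delimiters it finds. */
size_t count_tokens(char *text, const char *delims)
{
    size_t count = 0;

    /* The first call passes the string; later calls pass NULL to continue. */
    for (char *tok = strtok(text, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;

    return count;
}
```

For `char s[] = "12 34  ab,cd";` the call count_tokens(s, " ,") returns 4; runs of consecutive delimiters do not produce empty tokens.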

    8. Determining the length of a string. The strlen function is used to determine the length of a string.

    size_t strlen (const char *str);

    returns the length of the string, ignoring the last null byte. For example,

    char str[] = "123";

    printf("len = %zu\n", strlen(str)); // prints: len = 3

    1.7. Functions for working with memory.

    The header file string.h also describes functions for working with memory blocks, which are similar to the corresponding functions for working with strings.

    void* memchr (const void *str, int c, size_t n);

    searches for the first occurrence of the character c in the first n bytes of the memory block str.

    int memcmp (const void *str1, const void *str2, size_t n);

    compares the first n bytes of the memory blocks str1 and str2.

    void* memcpy (void *str1, const void *str2, size_t n);

    copies the first n bytes from the block str2 to the block str1. The blocks must not overlap.

    void* memmove (void *str1, const void *str2, size_t n);

    copies the first n bytes from str2 to str1, ensuring that overlapping blocks are handled correctly.

    void* memset (void *str, int c, size_t n);

    writes the character specified by c into the first n bytes of the block str.
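A typical case where the source and destination overlap is shifting the tail of a string towards its beginning. This is where memmove is required and memcpy is not allowed. A minimal sketch (drop_prefix is a hypothetical helper name):

```c
#include <string.h>

/* Removes the first n characters of s in place and returns s.
   If n exceeds the length of s, the result is the empty string. */
char *drop_prefix(char *s, size_t n)
{
    size_t len = strlen(s);

    if (n > len)
        n = len;
    /* Source and destination overlap, so memmove is used; the + 1
       moves the terminating null byte along with the characters. */
    memmove(s, s + n, len - n + 1);
    return s;
}
```

For `char s[] = "123 abc";` the call drop_prefix(s, 4) leaves "abc" in s.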

    Hello, Habr!

    Not long ago a rather interesting incident happened to me, involving one of the teachers at a computer science college.

    A conversation about programming under Linux gradually led to this person arguing that the complexity of systems programming is greatly exaggerated. That the C language is as simple as a matchstick, and so, in fact, is the Linux kernel (in his words).

    I had with me a laptop running Linux with the gentleman's set of utilities for C development (gcc, vim, make, valgrind, gdb). I no longer remember what goal we set ourselves, but a couple of minutes later my opponent was sitting at the laptop, fully prepared to solve the problem.

    And literally in the very first lines he made a serious mistake when allocating memory for... a string.

    char *str = (char *)malloc(sizeof(char) * strlen(buffer));
    Here buffer is a stack variable into which data from the keyboard was written.

    I think there will certainly be people who ask, "How could anything be wrong with this?"
    Believe me, it can.

    As for what exactly, read on.

    A little theory: a short crash course.

    If you already know this, scroll to the next heading.

    A string in C is an array of characters that must always end with '\0', the end-of-string character. Strings on the stack (fixed-size ones) are declared like this:

    char str[n] = { 0 };
    n is the size of the character array; the string it holds can be at most n - 1 characters long, since one element is needed for '\0'.

    The initializer { 0 } zeroes the string (it is optional; you can declare the array without it). The result is the same as calling memset(str, 0, sizeof(str)) or bzero(str, sizeof(str)). It is used so that no garbage is left in uninitialized variables.

    You can also immediately initialize a string on the stack:

    char buf[] = "default buffer text\n";
    In addition, a string can be declared as a pointer, with memory allocated for it on the heap:

    char *str = malloc(size);
    size is the number of bytes we allocate for the string. Such strings are called dynamic, because the required size is computed at run time and the allocated memory can be resized at any time with the realloc() function.

    In the case of a stack variable, I used the notation n to determine the size of the array; in the case of a heap variable, I used the notation size. And this perfectly reflects the true essence of the difference between a declaration on the stack and a declaration with memory allocation on the heap, because n is usually used when talking about the number of elements. And size is a completely different story...

    Valgrind will help us

    I also mentioned it in my previous article. Valgrind is a very useful program that helps the programmer track down memory leaks and context errors, exactly the things that most often crop up when working with strings.

    Let's look at a short listing that implements something similar to the program I mentioned, and run it through valgrind:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define HELLO_STRING "Hello, Habr!\n"

    int main(void)
    {
        char *str = malloc(sizeof(char) * strlen(HELLO_STRING));
        strcpy(str, HELLO_STRING);
        printf("->\t%s", str);
        free(str);
        return 0;
    }
    And here is the result of running the program:

    $ gcc main.c
    $ ./a.out
    -> Hello, Habr!
    Nothing unusual yet. Now let's run this program with valgrind!

    $ valgrind --tool=memcheck ./a.out
    ==3892== Memcheck, a memory error detector
    ==3892== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
    ==3892== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
    ==3892== Command: ./a.out
    ==3892==
    ==3892== Invalid write of size 2
    ==3892==    at 0x4005B4: main (in /home/indever/prg/C/public/a.out)
    ==3892==  Address 0x520004c is 12 bytes inside a block of size 13 alloc'd
    ==3892==    at 0x4C2DB9D: malloc (vg_replace_malloc.c:299)
    ==3892==    by 0x400597: main (in /home/indever/prg/C/public/a.out)
    ==3892==
    ==3892== Invalid read of size 1
    ==3892==    at 0x4C30BC4: strlen (vg_replace_strmem.c:454)
    ==3892==    by 0x4E89AD0: vfprintf (in /usr/lib64/libc-2.24.so)
    ==3892==    by 0x4E90718: printf (in /usr/lib64/libc-2.24.so)
    ==3892==    by 0x4005CF: main (in /home/indever/prg/C/public/a.out)
    ==3892==  Address 0x520004d is 0 bytes after a block of size 13 alloc'd
    ==3892==    at 0x4C2DB9D: malloc (vg_replace_malloc.c:299)
    ==3892==    by 0x400597: main (in /home/indever/prg/C/public/a.out)
    ==3892==
    -> Hello, Habr!
    ==3892==
    ==3892== HEAP SUMMARY:
    ==3892==     in use at exit: 0 bytes in 0 blocks
    ==3892==   total heap usage: 2 allocs, 2 frees, 1,037 bytes allocated
    ==3892==
    ==3892== All heap blocks were freed -- no leaks are possible
    ==3892==
    ==3892== For counts of detected and suppressed errors, rerun with: -v
    ==3892== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 0 from 0)
    The line "All heap blocks were freed -- no leaks are possible" tells us there are no leaks, which is good news. But it is worth looking a little further down (although, I should note, this is only the summary; the main information is elsewhere in the report):

    ==3892== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 0 from 0)
    3 mistakes. In 2 contexts. In such a simple program. How!?

    Yes, very simple. The whole “funny thing” is that the strlen function does not take into account the end-of-line character - “\0”. Even if you explicitly specify it in the incoming line (#define HELLO_STRING “Hello, Habr!\n\0”), it will be ignored.

    Just above the result of the program execution, the line -> Hello, Habr! there is a detailed report of what and where our precious valgrind didn’t like. I suggest you look at these lines yourself and draw your own conclusions.

    The correct version of the program looks like this:

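    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define HELLO_STRING "Hello, Habr!\n"

    int main(void)
    {
        /* strlen() does not count the terminating '\0', so reserve one extra byte */
        char *str = malloc(sizeof(char) * (strlen(HELLO_STRING) + 1));
        strcpy(str, HELLO_STRING);
        printf("->\t%s", str);
        free(str);
        return 0;
    }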
    Let's run it through valgrind:

    $ valgrind --tool=memcheck ./a.out
    -> Hello, Habr!
    ==3435==
    ==3435== HEAP SUMMARY:
    ==3435==     in use at exit: 0 bytes in 0 blocks
    ==3435==   total heap usage: 2 allocs, 2 frees, 1,038 bytes allocated
    ==3435==
    ==3435== All heap blocks were freed -- no leaks are possible
    ==3435==
    ==3435== For counts of detected and suppressed errors, rerun with: -v
    ==3435== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
    Great. There are no errors; the one extra byte of allocated memory solved the problem.

    Interestingly, in most cases the first and second programs behave the same. But if the memory just past the buffer, where the terminating character did not fit, happens not to be zero, printf() will print the string followed by all the garbage after it, and will keep going until it runs into a null character.

    However, (strlen(str) + 1) is only a so-so solution. It leaves us with two problems:

    1. What if we need to allocate memory for a string generated with, for example, s(n)printf(..)? strlen() does not accept format arguments.
    2. Appearance. The variable declaration line looks just awful. Some people even cast the result of malloc to (char *), as if they were writing C++. In a program that processes strings regularly, it makes sense to find a more elegant solution.
    Let's come up with a solution that will satisfy both us and valgrind.

    snprintf()

    int snprintf(char *str, size_t size, const char *format, ...); is an extension of sprintf() that formats a string and writes it to the buffer passed as the first argument. Unlike sprintf(), it writes at most size bytes to str, including the terminating null character.

    The function has one interesting feature: it always returns the length the fully formatted string would have (not counting the terminating null character), even if the output had to be truncated. For an empty string it returns 0, and a negative value signals an error.

    One of the problems I described with strlen() concerns the sprintf() and snprintf() functions. Suppose we need to write something into the string str, and the resulting string includes the values of other variables. The code should look something like this:

    char *str = /* allocate memory here */;
    sprintf(str, "Hello, %s\n", "Habr!");
    The question arises: how to determine how much memory should be allocated for the string str?

    char *str = malloc(sizeof(char) * (strlen(str, "Hello, %s\n", "Habr!") + 1)); will not work. The strlen() prototype looks like this:

    #include <string.h>

    size_t strlen(const char *s);
    It takes a single const char *s, so it cannot accept a format string with a variable number of arguments.

    A useful property of snprintf() that I mentioned above will help us here. Let's look at the following program:

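    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        /* snprintf() returns the length without the terminating '\0', so we add
           room for it. Note that sizeof("\0") is 2 (the literal has its own
           terminator), so this reserves one byte more than strictly needed. */
        size_t needed_mem = snprintf(NULL, 0, "Hello, %s!\n", "Habr") + sizeof("\0");
        char *str = malloc(needed_mem);
        snprintf(str, needed_mem, "Hello, %s!\n", "Habr");
        printf("->\t%s", str);
        free(str);
        return 0;
    }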
    Run the program in valgrind:

    $ valgrind --tool=memcheck ./a.out
    -> Hello, Habr!
    ==4132==
    ==4132== HEAP SUMMARY:
    ==4132==     in use at exit: 0 bytes in 0 blocks
    ==4132==   total heap usage: 2 allocs, 2 frees, 1,041 bytes allocated
    ==4132==
    ==4132== All heap blocks were freed -- no leaks are possible
    ==4132==
    ==4132== For counts of detected and suppressed errors, rerun with: -v
    ==4132== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
    Great. We now support arguments. Because we pass zero as the second argument (the size), snprintf() never writes through the null pointer, so there is no segfault. Nevertheless, the function still returns the size required for the string.

    But on the other hand, we had to introduce an additional variable, and the construct

    size_t needed_mem = snprintf(NULL, 0, "Hello, %s!\n", "Habr") + sizeof("\0");
    looks even worse than in the case of strlen().

    Note that + sizeof("\0") cannot be dropped by writing "\0" at the end of the format string (size_t needed_mem = snprintf(NULL, 0, "Hello, %s!\n\0", "Habr");). The format is itself a C string, so snprintf() stops reading it at the first null character; the embedded "\0" is not counted, and the buffer would again be one byte short. The extra byte has to be added explicitly.

    We need to do something. I thought a little and decided that now was the time to appeal to the wisdom of the ancients. Let's define a function-like macro that calls snprintf() with a null pointer as the first argument and zero as the second. And let's not forget about the terminating null character!

    #define strsize(args...) (snprintf(NULL, 0, args) + sizeof("\0"))
    Yes, it may be news to some, but C macros support a variable number of arguments: the ellipsis tells the preprocessor that the macro parameter (args in our case) stands for several actual arguments. Note that the named form args... is a GNU extension; standard C99 writes ... in the parameter list and __VA_ARGS__ in the body.

    Let's check our solution in practice:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define strsize(args...) (snprintf(NULL, 0, args) + sizeof("\0"))

    int main(void)
    {
        char *str = malloc(strsize("Hello, %s\n", "Habr!"));
        sprintf(str, "Hello, %s\n", "Habr!");
        printf("->\t%s", str);
        free(str);
        return 0;
    }
    Let's run it under valgrind:

    $ valgrind --tool=memcheck ./a.out
    -> Hello, Habr!
    ==6432==
    ==6432== HEAP SUMMARY:
    ==6432==     in use at exit: 0 bytes in 0 blocks
    ==6432==   total heap usage: 2 allocs, 2 frees, 1,041 bytes allocated
    ==6432==
    ==6432== All heap blocks were freed -- no leaks are possible
    ==6432==
    ==6432== For counts of detected and suppressed errors, rerun with: -v
    ==6432== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
    Yes, there are no errors. Everything is correct. And valgrind is happy, and the programmer can finally go to sleep.

    Finally, one more thing. If we need to allocate memory for an arbitrary string (even one with arguments), there is already a ready-made, fully working solution.

    We are talking about the asprintf function:

    #define _GNU_SOURCE /* See feature_test_macros(7) */
    #include <stdio.h>

    int asprintf(char **strp, const char *fmt, ...);
    It takes a pointer to a string pointer (char **strp) as its first argument, allocates a buffer large enough for the formatted result and stores its address in *strp. The caller releases the buffer with free().

    Our program written using asprintf() will look like this:

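    #define _GNU_SOURCE /* required by glibc for asprintf() */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        char *str;
        /* asprintf() returns -1 on failure; str must not be used in that case */
        if (asprintf(&str, "Hello, %s!\n", "Habr") < 0)
            return 1;
        printf("->\t%s", str);
        free(str);
        return 0;
    }
    And here it is under valgrind: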

    $ valgrind --tool=memcheck ./a.out
    -> Hello, Habr!
    ==6674==
    ==6674== HEAP SUMMARY:
    ==6674==     in use at exit: 0 bytes in 0 blocks
    ==6674==   total heap usage: 3 allocs, 3 frees, 1,138 bytes allocated
    ==6674==
    ==6674== All heap blocks were freed -- no leaks are possible
    ==6674==
    ==6674== For counts of detected and suppressed errors, rerun with: -v
    ==6674== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
    Everything is fine, but, as you can see, more memory has been allocated, and there are now three allocs, not two. On weak embedded systems, using this function is undesirable.
    In addition, if we write man asprintf in the console, we will see:

    CONFORMING TO
           These functions are GNU extensions, not in C or POSIX. They are also available under *BSD. The FreeBSD implementation sets strp to NULL on error.

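    This makes it clear that asprintf() is not part of standard C or POSIX: it is a GNU extension (also provided by the BSDs), so code that relies on it is not portable to every platform.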

    Conclusion

    In conclusion, I want to say that working with strings in C is a complex topic with a number of nuances. For example, to write "safe" code when allocating memory dynamically, it is recommended to use calloc() instead of malloc(), because calloc() fills the allocated memory with zeros. Alternatively, call memset() after allocating the memory. Otherwise, the garbage that was initially in the allocated region may cause problems during debugging, and sometimes when working with the string itself.

    More than half of the C programmers I know (most of them beginners) who solved the problem of allocating memory for strings at my request did it in a way that ended in the kind of errors valgrind reported above. In one case it even led to a memory leak (someone simply forgot to call free(str); who hasn't?). This is what prompted me to write the article you have just read.

    I hope this article will be useful to someone. The point of all this fuss is that no language is simple: each has its own subtleties, and the more of them you know, the better your code.

    I believe that after reading this article your code will become a little better :)
    Good luck, Habr!