system("touch"); to make sure a file exists, and/or has a recent access/modification time. For example, see here, here and here.
I'm here to tell you: system() sucks! Why? Take a look at man system:

int system(const char *command);

DESCRIPTION
system() executes a command specified in command by calling /bin/sh -c command, and returns after the command has been completed.

So it's not just running the touch program. It's starting a shell, then running whatever you passed in that shell. This is:
- Slow. First you start a shell, then you start another program from that shell? Seems like a lot of hassle.
- A security risk. Say you take the filename from the user, then run something like:
  std::stringstream ss;
  ss << "touch " << filename;
  system(ss.str().c_str());
  What happens if I (the malicious user) give input like "fakename ; rm -rf --no-preserve-root /;"? Well, it creates (or updates the timestamp of) fakename, then tries to delete everything!
- Very platform dependent. The POSIX Standard has this to say:
  [T]he system() function shall pass the string pointed to by command to that command processor to be executed in an implementation-defined manner; this might then cause the program calling system() to behave in a non-conforming manner or to terminate.
  And that's just system. The utility you are calling may vary significantly. Alright, touch probably won't, but I've seen people use system with, for instance, ls, whose output will vary significantly in format across platforms.
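If you really do have to run an external program, you can at least skip the shell. Here is a minimal sketch, assuming a POSIX system with touch on the PATH; run_touch_without_shell is a name I made up for illustration. The filename travels as a single argv entry, so a ";" inside it is just part of the name.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <string>

// Hypothetical helper: run the touch program on `filename` without a shell.
// Returns touch's exit status, or -1 if it could not be started or did not
// exit normally.
int run_touch_without_shell(const std::string& filename)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1; // fork() failed

    if (pid == 0)
    {
        // Child: the filename is one argv entry, so no shell ever parses it.
        // "--" ends option parsing, so a name starting with '-' is still
        // treated as a filename.
        execlp("touch", "touch", "--", filename.c_str(),
               static_cast<char*>(nullptr));
        _exit(127); // Only reached if exec failed.
    }

    // Parent: wait for the child and report how it exited.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with "fakename ; rm -rf --no-preserve-root /;", this should simply create one file with that awkward name. It still starts a second process, though, so the "slow" complaint only partly goes away.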
touch, so we should be able to replicate its behaviour from our own program. The logic surrounding parsing arguments and so on is something we should be pretty familiar with. What we need to know is how touch actually creates and updates a file. It needs to make calls out to the operating system ("system calls"). There is a handy command line tool to see what system calls are being made by a program, called strace (on some systems, truss; I don't know a full list of which to use where, but I do know it's strace on Linux, and truss on Solaris, FreeBSD and AIX).
I ran strace touch twice, once to create a file, then once to update it. It was basically the same each time, so I'll just show one. You get a lot of cruft just from a program starting up, obtaining heap memory, etc., but I cut it down to just the relevant bits:

$ strace touch testfile
...
open("testfile", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3
dup2(3, 0) = 0
close(3) = 0
dup2(0, 0) = 0
utimensat(0, NULL, NULL, 0) = 0
...

The two we care about are open and utimensat. Respectively, these open a file, creating it if necessary (O_CREAT), and update the timestamp.
open takes:
- const char* pathname
  The path (absolute or relative) of the file (or directory) to be opened.
- int flags
  A bitmask of flags, OR'd together, indicating how to open the path, e.g. O_CREAT to create the file if it doesn't already exist.
- mode_t mode
  Only required if O_CREAT is provided, this argument provides the permissions with which to create the file. This will be filtered against your umask: mode & ~umask.
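To see the umask filtering in action, here is a small sketch. created_permissions is a hypothetical helper, not anything from touch itself: it creates a file with a requested mode and reports the permission bits the file actually ended up with.

```cpp
#include <sys/stat.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: create `path` with the requested mode and return the
// permission bits the new file actually received, or -1 on failure.
// O_EXCL makes open() fail if the path already exists, so the result always
// describes a freshly created file.
long created_permissions(const char* path, mode_t requested)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, requested);
    if (fd < 0)
        return -1;

    struct stat st;
    int rc = fstat(fd, &st);
    close(fd);
    if (rc != 0)
        return -1;

    return st.st_mode & 07777; // Keep only the permission bits.
}
```

With a umask of 022, asking for 0666 should give you 0644, since 0666 & ~022 is 0644. (Default ACLs on the parent directory can change the picture, so treat this as the common case rather than a guarantee.)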
utimensat takes:
- int dirfd
  An open file descriptor to a directory from which to interpret a relative path. We will use the special value AT_FDCWD, which just means we interpret relative paths from the working directory of the program.
- const char* pathname
  As above.
- const struct timespec times[2]
  Two sets of values defining the times to be set. By passing a null pointer for this array, we just get the current time.
- int flags
  Another bitmask specifying details of how the call will be carried out. Nothing relevant to us.
C++ way, and get something like:

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <fcntl.h>
#include <unistd.h>
#include <utime.h>

#include <iostream>
#include <string>
#include <cstdlib>

void touch(const std::string& pathname)
{
    // Open the path for writing, creating the file if it doesn't exist.
    int fd = open(pathname.c_str(), O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666);
    if (fd<0) // Couldn't open that path.
    {
        std::cerr << __PRETTY_FUNCTION__ << ": Couldn't open() path \"" << pathname << "\"\n";
        return;
    }

    // A null times pointer sets both timestamps to the current time.
    int rc = utimensat(AT_FDCWD, pathname.c_str(), nullptr, 0);
    if (rc)
    {
        std::cerr << __PRETTY_FUNCTION__ << ": Couldn't utimensat() path \"" << pathname << "\"\n";
        return;
    }

    std::clog << __PRETTY_FUNCTION__ << ": Completed touch() on path \"" << pathname << "\"\n";
}

int main(int argc, char* argv[])
{
    if (argc!=2)
        return EXIT_FAILURE;
    touch (argv[1]);
    return EXIT_SUCCESS;
}

Of course, it would be very easy to rewrite this function in C.
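One variant worth sketching: the strace output shows utimensat being called on a descriptor with a NULL pathname, which on Linux is how futimens(fd, times) is implemented. Assuming that is what touch is really doing, a version along those lines could look like the following. touch_fd is a hypothetical name, and unlike the function above it reports success through its return value and closes the descriptor on the way out.

```cpp
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

#include <string>

// Hypothetical variant of touch(): returns true on success.
bool touch_fd(const std::string& pathname)
{
    int fd = open(pathname.c_str(),
                  O_WRONLY | O_CREAT | O_NOCTTY | O_NONBLOCK, 0666);
    if (fd < 0)
        return false; // Couldn't open or create the path.

    // A null times pointer sets both timestamps to the current time.
    // Working on the descriptor means the timestamps go on the file we just
    // opened, with no second path lookup.
    bool ok = (futimens(fd, nullptr) == 0);

    // Close the descriptor whether or not futimens() worked, so it is not
    // leaked.
    if (close(fd) != 0)
        ok = false;

    return ok;
}
```

This keeps the same open() flags as before; only the timestamp call and the cleanup differ.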
Also, if you only want to make sure the file exists, and don't care about the timestamps, you could just create a std::ofstream (remembering to pass app and check is_open()).
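A minimal sketch of that idea, with ensure_exists as a made-up name:

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: make sure `pathname` exists without truncating it.
// Returns true if the file could be opened, which means it now exists.
bool ensure_exists(const std::string& pathname)
{
    // std::ios::app opens for appending: the file is created if it is
    // missing, and any existing contents are kept.
    std::ofstream file(pathname, std::ios::app);
    return file.is_open();
}
```

Without app, a plain std::ofstream truncates the file on open, which is probably not what you want from something that is only meant to check a file is there.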
Do you know why touch includes this explicit zeroing of the subsecond portion of the time using that utimensat() syscall? Why not let the subsecond time float with the actual file creation time, retaining the subsecond granularity if the filesystem supports it?
You have a memory leak.
Should close file descriptor after usage.
Your strace output has the pathname as NULL. Reading the Linux manpage, we find that, on Linux: "futimens(fd, times) is implemented as: utimensat(fd, NULL, times, 0);". (This is not a standard feature of utimensat().)
So, this tells us that touch is using futimens(), not utimensat(), which makes sense. It avoids a second path lookup, but it also means that there can't be any race conditions where the file system changes and you unintentionally update the time on a different file from the one you just opened/created. So, your implementation would be better if it switched to futimens(), too. It might be more secure, too, given that the multiple path lookup pattern frequently has security implications, even if I don't see it in this case.
Great!
Touch is open source. A better approach than strace is to simply look at the source code.
What if it does some crucial conditional for exceptional cases? Or important conditional compilation for compatibility?
https://github.com/wertarbyte/coreutils/blob/master/src/touch.c
Helpful in 2019 :P :)