Skip to content
This repository was archived by the owner on Jun 29, 2020. It is now read-only.

Directly execute start.jar rather than use shell script #16

Closed
wants to merge 4 commits into from
Closed

Directly execute start.jar rather than use shell script #16

wants to merge 4 commits into from

Conversation

gregw
Copy link
Contributor

@gregw gregw commented Aug 26, 2015

Rather than use the shell script to start Jetty, I believe it would be better to directly run the start.jar

The jetty shell script is really intended as an /etc/init.d script and does all the location of java, home and base, which is already known in the ENV settings of the Dockerfile.

So we can avoid creating a shell process, which will be some time and space savings, but more importantly we are able to pass in command line arguments to the start.jar, so you can do things like:

docker run jetty-9.3-jre8 --list-config

To see the configuration of the image.

docker run jetty-9.3-jre8 --module=debug

runs jetty, but with the debug module enabled

docker run jetty-9.3-jre8 --dry-run

Reports the exact command line that would be run by start.jar, which can be used to craft an Entry-Point for another image that avoids even the start.jar startup time and memory.

I've only updated 9.3 image at the moment.... if this is considered a good idea I can do all the versions... or perhaps we only do it for versions going forward? or perhaps it is a bad idea for some reason I don't know yet???

@md5
Copy link
Member

md5 commented Aug 26, 2015

I like where you're going with this, but I think we'll need to refine it a bit.

One thing in particular that stands out as going against the best practices for Official Images (and Dockerfiles in general), is the use of ENTRYPOINT in this way (you can see the best practices here). There are only a couple of official images that I'm aware of that use the ENTRYPOINT like this and they're generally images that don't have a shell in them at all (swarm comes to mind).

Most of the official images instead use a small Bash wrapper script that detect whether the first argument to the CMD starts with "-" and, if so, prepends the same command specified in the Dockerfile's CMD. See elasticsearch for an example, though there are many more like this: https://github.com/docker-library/elasticsearch/blob/master/docker-entrypoint.sh#L6

Many of those wrapper scripts also add a call to gosu to drop privileges to a specific user, but we're using Jetty's setuid module to do that here.

One thing that concerns me about the approach normally used in those wrapper scripts is that start.jar will take arguments that don't start with "-", so it may be difficult to write a wrapper script that does the right thing (e.g. jetty.sh is currently passing jetty.state=$JETTY_STATE as an argument to start.jar, so a user may want to pass properties that way as well).

Another thing I noticed is that we'll no longer be setting jetty.state, which jetty.sh currently does based on the JETTY_STATE variable. Since I was relying on jetty.state living under /run/jetty when I documented how to run a read-only jetty container, I'm concerned that this could be a breaking change (see the docs here: https://github.com/docker-library/docs/tree/master/jetty#read-only-container). I suppose that could be avoided by passing -Djetty.state=/run/jetty/jetty.state in the default CMD. The logic for separating jetty.state out of JETTY_BASE was to allow JETTY_BASE to remain immutable once a read-only container is started.

It also looks like the JETTY_RUN environment variable could be dropped from the Dockerfile, since it's only used in jetty.sh to control the location of the pid file that we'll no longer have.

I may have some other comments once I sleep on it, but like I said initially, I definitely support the goal of allowing the users to pass command line arguments to start.jar in this way. I'd just like to do so in a way that avoids breaking things for existing users (there's currently 52k+ pulls from Docker Hub).

@md5
Copy link
Member

md5 commented Aug 26, 2015

BTW, the rationale for the restricted use of ENTRYPOINT in most of the official images is to allow users to do things like docker run --rm -it "$IMAGE_NAME" bash in order to explore the image filesystem interactively. Using an ENTRYPOINT the way you have in this PR will mean that users have to do docker run --rm -it --entrypoint bash "$IMAGE_NAME" instead, which the Official Image project maintainers want to avoid where possible.

@gregw
Copy link
Contributor Author

gregw commented Aug 26, 2015

@md5, yeh I was a bit concerned about breaking changes....
However the whole jetty.state stuff is only used by the jetty.sh mechanism, which uses the jetty.conf file to add the jetty-started.xml file to the effective command line that then creates the state file. If we start jetty directly, there is no jetty.state file created (unless use puts the jetty-started.xml file on command line).

The only purpose of the jetty.state file is so that "jetty.sh start" can wait to report success or failure before exiting. Since the current image uses "jetty.sh run", this is not required and another reason to look to avoid the complex moving parts of jetty.sh. I can't think that anybody else would be using the state file, so if it doesn't get created it should not break your read-only containers.

Thus I think the biggest problem is the use of entry point that is against the conventions of other containers. So perhaps a simple shell wrapper that looks at the first passed parameter and if it is executable (eg bash), then execute the whole line, otherwise just append to java -jar start.jar

Note that one of the drivers for this for me is start time, were I have a need to spin up containers ASAP, so even the few ms to start a shell and then exec java are seen as unacceptable cost... but then we are likely to replace whatever CMD is there by the exact command generated by --dry-run anyway... so a little shell script is OK.

Which takes me full circle to thinking that if there is a shell script... then why not use the standard one. But then jetty.sh also does stuff like looking for /etc/default/jetty and /etc/jetty.conf for additional configuration... so depending on if that is ever used or not, that is either more unnecessary work or breaking changes :(

I'll ponder on this some more.

cheers

@gregw
Copy link
Contributor Author

gregw commented Aug 26, 2015

I just pushed a change with a few clean ups and moved the ENTRYPOINT to a CMD. This is a not so bad compromise as the following commands are all simple enough and work:

docker run jetty-9.3-jre8
docker run --rm -it jetty-9.3-jre8 bash
docker run --rm -it jetty-9.3-jre8 java -jar /usr/local/jetty/start.jar --list-config

The last is a little verbose and not so obvious... but documentation can help with that.

So this avoids shell scripts and allows the bash convention to be observed.

One additional breaking change may be that we need to add the logging module, as currently jetty.sh adds jetty-logging.xml to the command line, so this PR removes it and thus changes behaviour. I kind of like the log coming out stderr... but that would be a change.

thoughts?

@gregw
Copy link
Contributor Author

gregw commented Aug 26, 2015

Ah ignore the jetty-logging comment, I see that is stripped out of the jetty.conf anyway! No need to do that now... another push coming...

@gregw
Copy link
Contributor Author

gregw commented Aug 26, 2015

Note also, regarding the setuid, I don't think that is really necessary. You would use setuid if you want to open port 80 and then change to the jetty user. But this image opens 8080, so it is probably better just to use USER?

Even if you want to use 80, it is probably better to use ipchains or iptables to map to 8080 rather than open as root and setuid.

@md5
Copy link
Member

md5 commented Aug 26, 2015

I'm not a big fan of using USER, since it complicates creating derived images that need to do things as root. My original reasoning is here:

The reason I didn't use USER jetty is probably the same reason that (for example) ghost uses gosu instead of USER user, i.e. ease of creating derived images without having to switch back to USER root. In this case, Jetty provides the functionality built-in, so I didn't want to add gosu.

@md5
Copy link
Member

md5 commented Aug 26, 2015

We should be able to revisit that decision, but it will break some of the existing derived images. I see at least a few doing thing that require root, but they're often just a chown -R jetty:jetty (which would actually work fine as USER jetty):

https://github.com/search?l=Dockerfile&q=from+jetty&type=Code&utf8=%E2%9C%93

The big problem with Docker images in general is that there's no in-band way to notify users of breaking changes. Couple this with the standard practice of only tagging images with a (mutable) tag based on the upstream version and it becomes really easy to break a user's installation unexpectedly. 😢

@md5
Copy link
Member

md5 commented Aug 26, 2015

Also, there was some discussion of publishing the SHA256 digest of official images in the README to allow users to pin to a specific build of the image, but I can't for the life of me find it.

@md5
Copy link
Member

md5 commented Aug 26, 2015

@tianon @yosifkit Do you recall where that discussion about the digests was? I can't find it in docker-library/docs or docker-library/official-images, so I think it must have been in one of the individual repos...

@md5
Copy link
Member

md5 commented Aug 26, 2015

I seem to remember the discussion being related to the postgres image, but I can't find it in docker-library/postgres either.

@md5 md5 changed the title Directly execut start.jar rather than use shell script Directly execute start.jar rather than use shell script Aug 26, 2015
@md5
Copy link
Member

md5 commented Aug 26, 2015

Found it! The discussion starts here: docker-library/postgres#75 (comment)

@md5
Copy link
Member

md5 commented Aug 26, 2015

@gregw One more thing regarding the setuid module that I'll add is that I find it strange that you're concerned about the overhead of a small amount of shell scripting at container startup time, but not about the overhead of NAT-ing the network connection. I'd think that startup time would be dominated by JVM start time and Jetty initialization itself.

Using setuid to bind port 80 will let users use a non-NAT networking mode like macvlan or --net=host to get the optimal networking performance at run time, which to me trumps any gain in container start time.

@md5
Copy link
Member

md5 commented Aug 26, 2015

It could also let them do things as root (like read secret keys or something) before the SetUIDListener takes over, meaning that a breakout as user jetty would be unable to read those secrets itself without an additional privilege escalation.

@md5
Copy link
Member

md5 commented Aug 26, 2015

@tianon What do you think about an approach like the following instead of the if [ "${1:0:1}" = '-' ]; then approach that is normally taken:

if ! which "$1" >/dev/null; then
    set -- java -jar /usr/local/jetty/start.jar "$@"
fi

???

@md5
Copy link
Member

md5 commented Aug 26, 2015

@gregw I'll also mention that the reason I'm setting TMPDIR to /tmp/jetty is to easily allow users to mount that directory as a volume in the --read-only case without mounting all of /tmp. The reason I did that was that there was some discussion at the time I did so about allowing /run and /tmp inside the container to be tmpfs mounts.

@tianon
Copy link
Contributor

tianon commented Aug 26, 2015 via email

@md5
Copy link
Member

md5 commented Aug 26, 2015

@tianon 👍

I doubt that the things being passed to start.jar would be otherwise valid commands, but @gregw would know better.

@gregw
Copy link
Contributor Author

gregw commented Aug 26, 2015

@md5 - I probably have raised too many issues in this PR. So let's put some to bed:)

I'm sold on setuid - good as it is.
Happy for temp to be put in /tmp/jetty

I understand the need for not using ENTRYPOINT and am happy with how the PR currently is with a CMD. This works for the existing cases exactly the same as now. If you want to pass extra args into start.jar, then you have to reproduce the whole line... a bit verbose, but not too bad.

With regards to avoiding the jetty.sh invocation, while speed is part of my concern, it is mostly motivated by the fact that that script is not well suited for that task and is looking for /etc configuration and rediscovering env variables that are already known. Ie it is a moving part that is not needed.

So I think the PR as it currently stands is worthy of consideration. It removes the jetty.sh script and simplifies things, but should not change any significant behaviour.

For the record, start.jar can process:

  • JVM options like -D (it then execs another JVM)
  • its own options like --modules
  • properties in the form of name=value (used for XML evaluation)
  • property files that are expanded (also used for XML evaluation)
  • XML files (that are executed with our XmlConfiguration mechanism)

But I see no need for a ENTRYPOINT script as the direct execution of start.jar as a CMD is simple and can be replaced as needed.

Actually if it is the norm to not use ENTRYPOINT, then perhaps we should raise a feature request on docker run to support additive args to CMD... but that is definitely a different issue.

@md5
Copy link
Member

md5 commented Aug 26, 2015

The norm for ENTRYPOINT for official images is actually to support being able to pass arguments to the default CMD. I'll create a PR against your branch in a few hours after I get home to add a simple entrypoint. This will avoid having to duplicate the default command line if you just want to add arguments, but will still allow completely custom commands if needed.

Once that's merged into your branch (assuming it makes sense to you), we can squash and then merge your PR into master here. I might want to whip up a few quick tests for jetty to add to the official-images repository to makes sure that we're maintaining existing functionality.

In the future, I think it would be fine if you create branches in this repo directly. That will avoid the back-and-forth dance of me having to send me a PR for your PR 👍

One question I still have is about the removal of the sed command that removes jetty-logging from jetty.conf. The reason I had that in there is to ensure that Jetty logs to STDOUT and STDERR as per the Docker best practices (instead of to a file, which I thought was the default before I made that change).

@gregw
Copy link
Contributor Author

gregw commented Aug 27, 2015

Ah I misunderstood the best practise on ENTRYPOINT. So a little script is best practise, so let's go with that - it avoids jetty.sh and allows easy additional args to be passed.

Perhaps the best check to do is [which $1] ie if the first argument is executable, then execute it, else pass it to start.jar as extra arguments

With regards to removing jetty-logging from jetty.conf, that file is only used if jetty.sh is used. So no need to edit it. If you want, perhaps remove it and jetty.sh to be clear?

Do you want me to have a go at creating the script rather than a PR on a PR?

+1 on branches in future :)

@md5
Copy link
Member

md5 commented Aug 27, 2015

Here's the script I would use (not tested, but it should work unless I flubbed something):

#!/bin/bash

set -e

if [ "$1" = jetty.sh ]; then
    cat >&2 <<- 'EOWARN'
        ********************************************************************
        WARNING: Use of jetty.sh from this image is deprecated and may
             be removed at some point in the future.

             See the documentation for guidance on extending this image:
             https://github.com/docker-library/docs/tree/master/jetty
        ********************************************************************
    EOWARN
fi

if ! type "$1" &>/dev/null; then
    set -- java -jar "-Djava.io.tmpdir=$TMPDIR" "$JETTY_HOME/start.jar" "$@"
fi

exec "$@"

We're going to want to have a copy of that script at the root of the repository and some additional code in update.sh to keep the entrypoint consistent across the images.

Also, since existing users could have done customization based on jetty.sh, I'd prefer to leave it in the image. This probably means we need to leave the PATH addition and the removal of jetty-logging as well to support existing derived images. Adding a comment that the jetty-logging removal is only needed to support deprecated usage of jetty.sh could make sense though.

@md5
Copy link
Member

md5 commented Aug 27, 2015

In the Dockerfile, we'll want to COPY that script to /docker-entrypoint.bash and use that as the ENTRYPOINT. It should have the same name in each image directory and be properly chmod'd. 👍

@md5
Copy link
Member

md5 commented Aug 27, 2015

We might also want a comment in the script before the type "$1" magic to say that it's prepending the default command to $@ in the case that an alternate command is not specified.

@md5
Copy link
Member

md5 commented Aug 27, 2015

I just noticed that GitHub rendered my tabs as four spaces instead of actual tabs... I think that's because I used bash syntax highlighting instead of an unstyled fenced block.

@gregw
Copy link
Contributor Author

gregw commented Aug 27, 2015

Pushed another iteration - now using docker-entrypoint.bash

@md5
Copy link
Member

md5 commented Aug 27, 2015

I'd actually prefer a more minimal diff that reflects just the functionality that has changed. I just spent 5 minutes before I saw that you'd pushed a new commit and did this: https://github.com/appropriate/docker-jetty/compare/use-start-jar

@md5
Copy link
Member

md5 commented Aug 27, 2015

One difference is that I left the CMD in the Dockerfile, even though it's redundant. This is mainly for documentation purposes and is the way the other official images do it.

@md5
Copy link
Member

md5 commented Aug 27, 2015

Also, with the <<- form of heredoc, the start and end need to be equally indented.

@md5
Copy link
Member

md5 commented Aug 27, 2015

Do you want me to do a PR to jetty-project:master so you can squash the commits and keep everything in this same PR? That will record you as the author instead of me, but I don't care either way.

@gregw
Copy link
Contributor Author

gregw commented Aug 27, 2015

@md5 I'm not worried about commit credit, so whatever is simplest.
Also OK on making minimal changes (although there is scope to simplify with fewer RUNs that can be done later or never...)

But I think I'm missing about that docker-entrypoint.bash? If it is not set as the ENTRYPOINT, then how is it executed? Is it an automagic thing?

Your branch currently fails on :

> docker run -it --rm jetty-9.3-jre8 --list-config
exec: "--list-config": executable file not found in $PATH
Error response from daemon: Cannot start container c4e6cf5251136fef895ea0a16cd01d94a4912f997f46e7b8d9028619d922f73b: [8] System error: exec: "--list-config": executable file not found in $PATH

So I'm not understanding something???

Note also that it does work for

>docker run -it --rm jetty-9.3-jre8 jetty.sh run
Running Jetty: 
...

but no warning is produced?

out of time today.... will look again tomorrow.

@md5
Copy link
Member

md5 commented Aug 27, 2015

Gah! You're totally I right about my missing the ENTRYPOINT directive. Duh...

I'll fix this up and squash it.

@md5
Copy link
Member

md5 commented Aug 27, 2015

I've got a working version over in #17. Squashing left you as the author since your commit was first.

I'm going to update the tests to add a jetty test that works against the current version of the image and add it to .travis.yml. After that, I'll merge #17 which will close this PR. Once that's done, I'll add a further test or two that tests the new functionality.

@md5
Copy link
Member

md5 commented Aug 27, 2015

Thanks for initiating these changes! 👍

@gregw
Copy link
Contributor Author

gregw commented Aug 27, 2015

@md5 - all good! manual testing working as expected for me.

Only quibble is that perhaps the JETTY_RUN and JETTY_STATE env variables should be removed.

I've also noticed that it is missing a -R on the chown for JETTY_BASE, but I'll open that as another issue and do a direct commit if nobody objects.

@gregw
Copy link
Contributor Author

gregw commented Aug 27, 2015

Closing as now covered by #17

@gregw gregw closed this Aug 27, 2015
@md5
Copy link
Member

md5 commented Aug 27, 2015

The reason I left JETTY_RUN and JETTY_STATE is to support the case where jetty.sh run (or some variant thereof) is passed as the command explicitly. This could be the case (for example) if someone already had a derived image with its own ENTRYPOINT and had to put back CMD ["jetty.sh", "run"] (since setting ENTRYPOINT resets the parent image's CMD, IIRC).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants