Skip to content
This repository was archived by the owner on Nov 9, 2017. It is now read-only.

Msvc build fix (for generating git.sln) #317

Closed
wants to merge 353 commits into from

Conversation

PhilipOakley
Copy link

The 'msvc-build --vs' for creating a git.sln file that is readable and compilable by Visual Studio 2008 (last working version;-) had bit rotted.

This series, on top of 1.9.5.msysgit.0 (commit/0d6f7c802) fixes most of the issues, including: the duplicate GUID, perl make dependencies, many debug improvements, and others.

It's still not perfect, but does now run properly to create a git.sln which will build without error in VS08 express ed.

Note that 'make common-cmd.h' is still required before the 'msvc-build --vs' will compile without error.

Ultimately this will be submitted to upstream Git once review feedback has been obtained, so I'm not expecting a merge (obviously), rather I'd like others to see if it works for them.

Philip

Evgeny Pashkin and others added 30 commits June 2, 2014 10:58
On Windows XP3 in git bash
git clone [email protected]:octocat/Spoon-Knife.git
cd Spoon-Knife
git gui
menu Remote\Fetch from\origin
error: cannot spawn git: No such file or directory
error: could not run rev-list

if u run
git fetch --all
it worked normal in git bash or gitgui tools

In second version CreateProcess get 'C:\Git\libexec\git-core/git.exe' in
first version - C:/Git/libexec/git-core/git.exe and not executes (unix
slashes)

after fixing C:\Git\libexec\git-core\git.exe or
C:/Git/libexec/git-core\git.exe it works normal

Signed-off-by: Johannes Schindelin <[email protected]>
If Git were installed in a path containing non-ASCII characters,
commands such as git-am and git-submodule, which are implemented as
externals, would fail to launch with the following error:

> fatal: 'am' appears to be a git command, but we were not
> able to execute it. Maybe git-am is broken?

This was due to lookup_prog not being Unicode-aware. It was somehow
missed in 2ee5a1a.

Note that the only problem in this function was calling
GetFileAttributes instead of GetFileAttributesW. The calls to access()
were fine because access() is a macro which resolves to mingw_access,
which already handles Unicode correctly. But I changed lookup_prog to
use _waccess directly so that we only convert the path to UTF-16 once.

Signed-off-by: Adam Roben <[email protected]>
7ebac8c made launching of .exe
externals work when installed in Unicode paths. But it broke launching
of non-.exe externals, no matter where they were installed. We now
correctly maintain the UTF-8 and UTF-16 paths in tandem in lookup_prog.

This fixes t5526, among others.

Signed-off-by: Adam Roben <[email protected]>
The previous implementation said that the filesystem information on
Windows is not reliable to determine whether a file is executable.
To find gather this information it was peeking into the first two bytes
of a file to see whether it looks executable.
Apart from the fact that on Windows executables are usually defined as
such by their extension it lead to slow opening of help file in some
situations.

When you have virus scanner running calling open on an executable file
is a potentially expensive operation. See the following measurements (in
seconds) for example.

With virus scanner running (coldcache):

$ ./a.exe /libexec/git-core/
before open (git-add.exe): 0.000000
after open (git-add.exe): 0.412873
before open (git-annotate.exe): 0.000175
after open (git-annotate.exe): 0.397925
before open (git-apply.exe): 0.000243
after open (git-apply.exe): 0.399996
before open (git-archive.exe): 0.000147
after open (git-archive.exe): 0.397783
before open (git-bisect--helper.exe): 0.000160
after open (git-bisect--helper.exe): 0.397700
before open (git-blame.exe): 0.000160
after open (git-blame.exe): 0.399136
...

With virus scanner running (hotcache):

$ ./a.exe /libexec/git-core/
before open (git-add.exe): 0.000000
after open (git-add.exe): 0.000325
before open (git-annotate.exe): 0.000229
after open (git-annotate.exe): 0.000177
before open (git-apply.exe): 0.000167
after open (git-apply.exe): 0.000150
before open (git-archive.exe): 0.000154
after open (git-archive.exe): 0.000156
before open (git-bisect--helper.exe): 0.000132
after open (git-bisect--helper.exe): 0.000180
before open (git-blame.exe): 0.000718
after open (git-blame.exe): 0.000724
...

This test did just list the given directory and open() each file in it.

With this patch I get:

$ time git help git
Launching default browser to display HTML ...

real    0m8.723s
user    0m0.000s
sys     0m0.000s

and without

$ time git help git
Launching default browser to display HTML ...

real    1m37.734s
user    0m0.000s
sys     0m0.031s

both tests with cold cache and giving the machine some time to settle
down after restart.

Signed-off-by: Heiko Voigt <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
Commit 5fee995 introduced the stage_updated_gitmodules() function to
add submodule configuration updates to the index. It assumed that even
after calling remove_cache_entry_at() the same cache entry would still be
valid. This was true in the old days, as cache entries could never be
freed, but that is not so sure in the present as there is ongoing work to
free removed cache entries, which makes this code segfault.

Fix that by calling add_file_to_cache() instead of open coding it. Also
remove the "could not find .gitmodules in index" warning, as that won't
happen in regular use cases (and by then just silently adding it to the
index we do the right thing).

Thanks-to: Karsten Blees <[email protected]>
Signed-off-by: Jens Lehmann <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
The existing hashtable implementation (in hash.[ch]) uses open addressing
(i.e. resolve hash collisions by distributing entries across the table).
Thus, removal is difficult to implement with less than O(n) complexity.
Resolving collisions of entries with identical hashes (e.g. via chaining)
is left to the client code.

Add a hashtable implementation that supports O(1) removal and is slightly
easier to use due to builtin entry chaining.

Supports all basic operations init, free, get, add, remove and iteration.

Also includes ready-to-use hash functions based on the public domain FNV-1
algorithm (http://www.isthe.com/chongo/tech/comp/fnv).

The per-entry data structure (hashmap_entry) is piggybacked in front of
the client's data structure to save memory. See test-hashmap.c for usage
examples.

The hashtable is resized by a factor of four when 80% full. With these
settings, average memory consumption is about 2/3 of hash.[ch], and
insertion is about twice as fast due to less frequent resizing.

Lookups are also slightly faster, because entries are strictly confined to
their bucket (i.e. no data of other buckets needs to be traversed).

Signed-off-by: Karsten Blees <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Since commit 0c499ea the send-pack builtin uses the side-band-64k
capability if advertised by the server.

Unfortunately this breaks pushing over the dump git protocol if used
over a network connection.

The detailed reasons for this breakage are (by courtesy of Jeff Preshing,
quoted from ttps://groups.google.com/d/msg/msysgit/at8D7J-h7mw/eaLujILGUWoJ):
----------------------------------------------------------------------------
MinGW wraps Windows sockets in CRT file descriptors in order to mimic the
functionality of POSIX sockets. This causes msvcrt.dll to treat sockets as
Installable File System (IFS) handles, calling ReadFile, WriteFile,
DuplicateHandle and CloseHandle on them. This approach works well in simple
cases on recent versions of Windows, but does not support all usage patterns.
In particular, using this approach, any attempt to read & write concurrently
on the same socket (from one or more processes) will deadlock in a scenario
where the read waits for a response from the server which is only invoked after
the write. This is what send_pack currently attempts to do in the use_sideband
codepath.
----------------------------------------------------------------------------

The new config option "sendpack.sideband" allows to override the side-band-64k
capability of the server, and thus makes the dump git protocol work.

Other transportation methods like ssh and http/https still benefit from
the sideband channel, therefore the default value of "sendpack.sideband"
is still true.

Signed-off-by: Thomas Braun <[email protected]>
This reverts commit 00764ca, as our ancient version of "cp" has
problems about the "new" POSIX option "-P" (yields exit code 1).
Move opendir down in preparation for the next patch.

Signed-off-by: Karsten Blees <[email protected]>
Emulating the POSIX dirent API on Windows via FindFirstFile/FindNextFile is
pretty staightforward, however, most of the information provided in the
WIN32_FIND_DATA structure is thrown away in the process. A more
sophisticated implementation may cache this data, e.g. for later reuse in
calls to lstat.

Make the dirent implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Define a base DIR structure with pointers to readdir/closedir that match
the opendir implementation (i.e. similar to vtable pointers in OOP).
Define readdir/closedir so that they call the function pointers in the DIR
structure. This allows to choose the opendir implementation on a
call-by-call basis.

Move the fixed sized dirent.d_name buffer to the dirent-specific DIR
structure, as d_name may be implementation specific (e.g. a caching
implementation may just set d_name to point into the cache instead of
copying the entire file name string).

Signed-off-by: Karsten Blees <[email protected]>
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.

Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Signed-off-by: Karsten Blees <[email protected]>
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.

This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.

Enable read-only sections for 'git status' and preload_index.

Signed-off-by: Karsten Blees <[email protected]>
Checking the work tree status is quite slow on Windows, due to slow lstat
emulation (git calls lstat once for each file in the index). Windows
operating system APIs seem to be much better at scanning the status
of entire directories than checking single files.

Add an lstat implementation that uses a cache for lstat data. Cache misses
read the entire parent directory and add it to the cache. Subsequent lstat
calls for the same directory are served directly from the cache.

Also implement opendir / readdir / closedir so that they create and use
directory listings in the cache.

The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.

Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.

Signed-off-by: Karsten Blees <[email protected]>
Windows paths are typically limited to MAX_PATH = 260 characters, even
though the underlying NTFS file system supports paths up to 32,767 chars.
This limitation is also evident in Windows Explorer, cmd.exe and many
other applications (including IDEs).

Particularly annoying is that most Windows APIs return bogus error codes
if a relative path only barely exceeds MAX_PATH in conjunction with the
current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the
infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG.

Many Windows wide char APIs support longer than MAX_PATH paths through the
file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path.
Notable exceptions include functions dealing with executables and the
current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as
well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...).

Introduce a handle_long_path function to check the length of a specified
path properly (and fail with ENAMETOOLONG), and to optionally expand long
paths using the '\\?\' file namespace prefix. Short paths will not be
modified, so we don't need to worry about device names (NUL, CON, AUX).

Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be
limited to MAX_PATH (at least not on Win7), so we can use it to do the
heavy lifting of the conversion (translate '/' to '\', eliminate '.' and
'..', and make an absolute path).

Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH
limit.

Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs
that support long paths.

While improved error checking is always active, long paths support must be
explicitly enabled via 'core.longpaths' option. This is to prevent end
users to shoot themselves in the foot by checking out files that Windows
Explorer, cmd/bash or their favorite IDE cannot handle.

Test suite:
Test the case is when the full pathname length of a dir is close
to 260 (MAX_PATH).
Bug report and an original reproducer by Andrey Rogozhnikov:
msysgit#122 (comment)

Thanks-to: Martin W. Kirst <[email protected]>
Thanks-to: Doug Kelly <[email protected]>
Signed-off-by: Karsten Blees <[email protected]>
Original-test-by: Andrey Rogozhnikov <[email protected]>
Signed-off-by: Stepan Kasal <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
kblees and others added 24 commits December 17, 2014 22:03
If multiple threads access a directory that is not yet in the cache, the
directory will be loaded by each thread. Only one of the results is added
to the cache, all others are leaked. This wastes performance and memory.

On cache miss, add a future object to the cache to indicate that the
directory is currently being loaded. Subsequent threads register themselves
with the future object and wait. When the first thread has loaded the
directory, it replaces the future object with the result and notifies
waiting threads.

Signed-off-by: Karsten Blees <[email protected]>
test_cmp() is primarily meant to compare text files (and display the
difference for debug purposes).

Raw "cmp" is better suited to compare binary files (tar, zip, etc.).

On MinGW, test_cmp is a shell function mingw_test_cmp that tries to
read both files into environment, stripping CR characters (introduced
in commit 4d715ac).

This function usually speeds things up, as fork is extremly slow on
Windows.  But no wonder that this function is extremely slow and
sometimes even crashes when comparing large tar or zip files.

Signed-off-by: Stepan Kasal <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
…rectories-once

fscache: load directories only once
'git checkout' fails if a directory is longer than PATH_MAX, because the
lstat_cache in symlinks.c checks if the leading directory exists using
PATH_MAX-bounded string operations.

Remove the limitation by using strbuf instead.

Signed-off-by: Karsten Blees <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Rename cache_def_free to cache_def_clear as it doesn't free the struct
cache_def, but just clears its content.

Signed-off-by: Karsten Blees <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Signed-off-by: Nguyễn Thái Ngọc Duy <[email protected]>
Signed-off-by: Karsten Blees <[email protected]>
There is no actual nested struct here. Move it out for clarity.

Signed-off-by: Nguyễn Thái Ngọc Duy <[email protected]>
Signed-off-by: Karsten Blees <[email protected]>
This also fixes the problem of silently ignoring .gitignore if the
full path exceeds PATH_MAX. Now add_excludes_from_file() will report
if it gets ENAMETOOLONG.

Signed-off-by: Nguyễn Thái Ngọc Duy <[email protected]>
Signed-off-by: Karsten Blees <[email protected]>
Use a suffciently large buffer to strip the trailing slash.

Signed-off-by: Karsten Blees <[email protected]>
Fixes a small bug affecting push to remotes which use some sort of
multi-pass authentication. In particular the bug affected SabreDAV as
configured by Box.com [1].

It must be a weird server configuration for the bug to have survived
this long. Someone should write a test for it.

[1] http://marc.info/?l=git&m=140460482604482

Signed-off-by: Abbaad Haider <[email protected]>
Reviewed-by: Jeff King <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
…th-long-directories

Win32: fix checkout problem with directories exceeding MAX_PATH
Backport fixes from 2.0.x maintance releases
A 'make -n' dry-run is used as part of the /compat/vcbuild and
/contrib/buildsystems code. ee9be06 (perl: detect new files in
MakeMaker builds, 2012-07-27) was not aware of that usage.

Add a comment to the make file stating the issue and the available
solutions of either NO_PERL or a '+recipe'.

Signed-off-by: Philip Oakley <[email protected]>
ee9be06 (perl: detect new files in MakeMaker builds, 2012-07-27)
did not include dry-run support for the generation of the PM.stamp
file, though the dry-run output is used by the build engine.

Disable the perl processing during the dry-run to avoid the issue.

Signed-off-by: Philip Oakley <[email protected]>
Delete the duplicated GUID from the generation code for the Visual Studio
.sln project file.

The duplicate GUID tended to be allocated to test-svn-fe, which was then
ignored by Visual Studio / MSVC, and it's omission from the build never
noticed.

Signed-off-by: Philip Oakley <[email protected]>
The engine.pl script barfs on the properly quoted spaces in
filename options prevalent on Windows. Use shellwords() rather
than split() to separate such options.

Signed-off-by: Philip Oakley <[email protected]>
The i18n 5e9637c (i18n: add infrastructure for translating
Git with gettext, 2011-11-18) introduced an extra '-o' option
into the make file.

If the msvc buildsystem is run without NO_GETTEXT being set
then this broke the engine.pl code for extracting the git.sln
for msvc gui-IDE. The setting of NO_GETTEXT was not fixed until
later, relative to the Msysgit project where this issue was being
investigated.

The presence of these options in the Makefile output should not
compromise the derived build structure. They should be ignored.

Add tests to remove these non linker options, in same vein as
74cf9bd (engine.pl: Fix a recent breakage of the buildsystem
generator, 2010-01-22).

Signed-off-by: Philip Oakley <[email protected]>
Commit 4b623d8 (MSVC: link in invalidcontinue.obj for better
POSIX compatibility, 2014-03-29) is not processed correctly
by the buildsystem. Ignore it.

Also split the .o and .obj processing; 'make' does not produce .obj
files. Only substitute filenames ending with .o when generating the
source .c filename.

Signed-off-by: Philip Oakley <[email protected]>
Layout the 'either/or' with more white space to clarify
which alternatives are matched up.

Reference the Msysgit build script which automates one sequence of options.

Signed-off-by: Philip Oakley <[email protected]>
Save the stderr from the dry MSVC make to a well named file for
later review. Use 'msvc-build-makedryerrors.txt' which should be
obvious as to its source, and is not ignored by 'git status'.

Signed-off-by: Philip Oakley <[email protected]>
Keep the build clean of extraneous files if it is indeed clean.
Otherwise leave the msvc-build-makedryerrors.txt file both as
a flag for any CI system or for manual debugging.

Note that the file will contain the new values of the GIT_VERSION
and GITGUI_VERSION if they were generated by the make file. They
are omitted if the release is tagged and indentically defined in
their respective GIT_VERSION_GEN file DEF_VER variables.

Signed-off-by: Philip Oakley <[email protected]>
Highlight a debug suggestion for capturing the dry-run of the make file
used in determining the msvc-build structure.

Signed-off-by: Philip Oakley <[email protected]>
Assist developers transitioning between the two cultures
by including appropriate, but commented out, debug statements.

The exception is when an unhandled compiler option is detected,
where printing of the full line will supplement the line number and
option part. Otherwise the OP has no immediate mechanism for
inspecting the relevant part of the makedry output.

These debug print statements act as a guide for a poor man's --verbose
option. The test suite doesn't cover the contrib/buildsystems (or
Msysgit's msvc-build) contributions so fails to notice breakages there-in.

It is doubly hard to get developers to ride both horses so, contrary to
normal convention, retain selected debug statements as a safety net for
those willing to try.

Signed-off-by: Philip Oakley <[email protected]>
@buildhive
Copy link

MSysGit - the development behind Git for Windows » git #260 FAILURE
Looks like there's a problem with this pull request
(what's this?)

@sschuberth
Copy link

Something went wrong, this PR has 250+ commits from others than only you ...

@dscho
Copy link
Member

dscho commented Feb 12, 2015

This series, on top of 1.9.5.msysgit.0 (commit/0d6f7c802)

Would you terribly mind switching this PR to target maint-1.9 instead of master? The merge conflicts reported by BuildHive would probably go away, then.

@PhilipOakley
Copy link
Author

I just need to work out how to do that change - any suggestions?

@PhilipOakley
Copy link
Author

Closing this PR in favour of #318 which was the result of trying to edit this PR.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.