Original author(s) | Linus Torvalds [1] |
---|---|
Developer(s) | Junio Hamano and others [2] |
Initial release | 7 April 2005 |
Stable release | 2.44.0 [3] / 23 February 2024 |
Repository | |
Written in | Primarily in C, with GUI and programming scripts written in Shell script, Perl, Tcl and Python [4] [5] |
Operating system | POSIX (Linux, macOS, Solaris, AIX), Windows |
Available in | English |
Type | Version control |
License | GPL-2.0-only [i] [7] |
Website | git-scm |
Git (/ɡɪt/) [8] is a distributed version control system [9] that tracks changes in any set of computer files. It is usually used to coordinate work among programmers who are collaboratively developing source code during software development.
Git's goals include speed, data integrity, and support for distributed, non-linear workflows (thousands of parallel branches running on different computers). [10] [11] [12] Git was originally authored by Linus Torvalds in 2005 for the development of the Linux kernel, with other kernel developers contributing to its initial development. [13] Its creation was prompted by the revocation of the free license of BitKeeper, the proprietary source-control management system used for Linux kernel development since 2002. Since 2005, Junio Hamano has been the core maintainer of Git. As with most other distributed version control systems, and unlike most client–server systems, every Git directory on every computer is a full-fledged repository with complete history and full version-tracking abilities, independent of network access or a central server. [14] Git is free and open-source software shared under the GPL-2.0-only license.
Git's design benefits from Torvalds' experience with Linux and file-system performance, leading to features such as support for non-linear development, efficient handling of large projects, and cryptographic authentication of history. Its toolkit-based design allows for pluggable merge strategies and flexibility in managing version control tasks. Despite its comprehensive feature set, Git has faced security challenges, leading to updates and patches that address vulnerabilities. The trademark "Git" is registered by the Software Freedom Conservancy.
Git's adoption has grown rapidly, becoming the most popular distributed version control system, with nearly 95% of developers reporting it as their primary version control system as of 2022. [15] It is the most widely used source-code management tool among professional developers. There are offerings of Git repository services, including GitHub, SourceForge, Bitbucket and GitLab. [16] [17] [18] [19] [20]
Git development was started by Torvalds in April 2005 when the proprietary source-control management (SCM) system used for Linux kernel development since 2002, BitKeeper, revoked its free license for Linux development. [21] [22] The copyright holder of BitKeeper, Larry McVoy, claimed that Andrew Tridgell had created SourcePuller by reverse engineering the BitKeeper protocols. [23] The same incident also spurred the creation of another version-control system, Mercurial.
Torvalds wanted a distributed system that he could use like BitKeeper, but none of the available free systems met his needs. He cited an example of a source-control management system needing 30 seconds to apply a patch and update all associated metadata, and noted that this would not scale to the needs of Linux kernel development, where synchronizing with fellow maintainers could require 250 such actions at once. For his design criterion, he specified that patching should take no more than three seconds, and added three more goals: [10]
These criteria eliminated every version-control system in use at the time, so immediately after the 2.6.12-rc2 Linux kernel development release, Torvalds set out to write his own. [12]
The development of Git began on 3 April 2005. [24] Torvalds announced the project on 6 April, and it became self-hosting the next day. [24] [25] The first merge of multiple branches took place on 18 April. [26] Torvalds achieved his performance goals; on 29 April, the nascent Git was benchmarked recording patches to the Linux kernel tree at a rate of 6.7 patches per second. [27] On 16 June, Git managed the kernel 2.6.12 release. [28]
Torvalds turned over maintenance on 26 July 2005 to Junio Hamano, a major contributor to the project. [29] Hamano was responsible for the 1.0 release on 21 December 2005. [30]
Torvalds sarcastically quipped about the name git (which means "unpleasant person" in British English slang): "I'm an egotistical bastard, and I name all my projects after myself. First 'Linux', now 'git'." [31] [32] The man page describes Git as "the stupid content tracker". [33]
The read-me file of the source code elaborates further: [34]
"git" can mean anything, depending on your mood.
- Random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
- Stupid. Contemptible and despicable. Simple. Take your pick from the dictionary of slang.
- "Global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
- "Goddamn idiotic truckload of sh*t": when it breaks.
The source code for Git refers to the program as "the information manager from hell".
Git's design is a synthesis of Torvalds's experience with Linux in maintaining a large distributed development project, along with his intimate knowledge of file-system performance gained from the same project and the urgent need to produce a working system in short order. These influences led to the following implementation choices: [13]
git gc. [44] [45]
git gc command. [46]
For data integrity, both the packfile and its index have a SHA-1 checksum [47] inside, and the file name of the packfile also contains a SHA-1 checksum. To check the integrity of a repository, run the git fsck command. [48] [49]
Another property of Git is that it snapshots directory trees of files. The earliest systems for tracking versions of source code, Source Code Control System (SCCS) and Revision Control System (RCS), worked on individual files and emphasized the space savings to be gained from storing the (mostly similar) versions as interleaved deltas (SCCS) or with delta encoding (RCS). Later revision-control systems maintained this notion of a file having an identity across multiple revisions of a project. However, Torvalds rejected this concept. [50] Consequently, Git does not explicitly record file revision relationships at any level below the source-code tree.
These implicit revision relationships have some significant consequences:
Git implements several merging strategies; a non-default strategy can be selected at merge time: [56]
When there is more than one common ancestor that can be used for a three-way merge, it creates a merged tree of the common ancestors and uses that as the reference tree for the three-way merge. In tests on prior merge commits taken from the Linux 2.6 kernel development history, this has been reported to result in fewer merge conflicts without causing mis-merges. It can also detect and handle merges involving renames.
— Linus Torvalds [57]
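The case of more than one common ancestor arises in a so-called criss-cross history, where two branches have each merged the other. The following is a minimal sketch in Python on a hypothetical toy commit graph; the `parents` dictionary, the commit names, and both helper functions are assumptions made for illustration and do not reproduce Git's own implementation.

```python
def ancestors(parents, commit):
    """Return the set holding `commit` and every commit reachable via parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def merge_bases(parents, a, b):
    """Return the common ancestors of a and b that are not ancestors of another common ancestor."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c != other and c in ancestors(parents, other) for other in common)
    }


# Hypothetical criss-cross history: X and Y each merge both A and B.
parents = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "X": ["A", "B"],
    "Y": ["B", "A"],
}

bases = merge_bases(parents, "X", "Y")
print(sorted(bases))  # two candidates, so a single reference tree has to be synthesized
```

Here both `A` and `B` qualify as a base for merging `X` and `Y`, which is the situation in which a merged tree of the common ancestors is built and used as the reference.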
Git's primitives are not inherently a source-code management system. Torvalds explains: [58]
In many ways you can just see git as a filesystem—it's content-addressable, and it has a notion of versioning, but I really designed it coming at the problem from the viewpoint of a filesystem person (hey, kernels is what I do), and I actually have absolutely zero interest in creating a traditional SCM system.
From this initial design approach, Git has developed the full set of features expected of a traditional SCM, [59] with features mostly being created as needed, then refined and extended over time.
Git has two data structures: a mutable index (also called stage or cache) that caches information about the working directory and the next revision to be committed; and an immutable, append-only object database.
The index serves as a connection point between the object database and the working tree.
The object store contains five types of objects: [60] [48]
Each object is identified by a SHA-1 hash of its contents. Git computes the hash and uses this value for the object's name. The object is put into a directory matching the first two characters of its hash. The rest of the hash is used as the file name for that object.
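As a rough illustration of this naming scheme, the sketch below computes an object name in Python. It assumes the standard loose-object layout, in which the hashed bytes are a short header (object type, a space, the content length, and a NUL byte) followed by the content; the function names and the `.git/objects` prefix used for the path are illustrative choices, not part of any Git API.

```python
import hashlib


def git_object_id(obj_type: str, content: bytes) -> str:
    """Hash the header '<type> <size>' + NUL + content with SHA-1."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """First two hex characters name the directory, the rest name the file."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


empty_blob = git_object_id("blob", b"")
print(empty_blob)                     # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(loose_object_path(empty_blob))  # .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the name depends only on the type and the bytes, two files with identical contents map to the same object, wherever they appear in the tree.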
Git stores each revision of a file as a unique blob. The relationships between the blobs can be found through examining the tree and commit objects. Newly added objects are stored in their entirety using zlib compression. This can consume a large amount of disk space quickly, so objects can be combined into packs, which use delta compression to save space, storing blobs as their changes relative to other blobs.
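To make the storage cost concrete, here is a small Python sketch of whole-object storage with zlib. It assumes the usual loose-object encoding (a "type size" header, a NUL byte, then the content, all compressed together); the helper names and the sample revisions are hypothetical, and packfile delta compression is not implemented here.

```python
import zlib


def encode_loose_object(obj_type: str, content: bytes) -> bytes:
    """Compress header plus full content, as done for a newly added object."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return zlib.compress(header + content)


def decode_loose_object(data: bytes):
    """Reverse encode_loose_object and check the recorded size."""
    raw = zlib.decompress(data)
    header, _, content = raw.partition(b"\x00")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, content


# Two hypothetical revisions of one file that differ by a single line.
revision_1 = b"line one\n" * 200
revision_2 = revision_1 + b"one extra line\n"

stored = [encode_loose_object("blob", r) for r in (revision_1, revision_2)]
print([len(s) for s in stored])
```

Each revision is stored in full even though the two differ by one line, which is the redundancy that packs reduce by storing one blob as a delta against another.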
Additionally, Git stores labels called refs (short for references) to indicate the locations of various commits. They are stored in the reference database and are respectively: [66]
Every object in the Git database that is not referred to may be cleaned up by using a garbage collection command or automatically. An object may be referenced by another object or an explicit reference. Git knows different types of references. The commands to create, move, and delete references vary. git show-ref
lists all references. Some types are:
Git (the main implementation in C) is primarily developed on Linux, although it also supports most major operating systems, including the BSDs (DragonFly BSD, FreeBSD, NetBSD, and OpenBSD), Solaris, macOS, and Windows. [68] [69]
The first Windows port of Git was primarily a Linux-emulation framework that hosts the Linux version. Installing Git under Windows creates a similarly named Program Files directory containing the Mingw-w64 port of the GNU Compiler Collection, Perl 5, MSYS2 (itself a fork of Cygwin, a Unix-like emulation environment for Windows) and various other Windows ports or emulations of Linux utilities and libraries. Currently, native Windows builds of Git are distributed as 32- and 64-bit installers. [70] The official Git website maintains a build of Git for Windows, still using the MSYS2 environment. [71]
The JGit implementation of Git is a pure Java software library, designed to be embedded in any Java application. JGit is used in the Gerrit code-review tool, and in EGit, a Git client for the Eclipse IDE. [72]
Go-git is an open-source implementation of Git written in pure Go. [73] It is currently used as the backing library for projects such as a SQL interface for Git code repositories [74] and a tool that provides encryption for Git. [75]
The Dulwich implementation of Git is a pure Python software component for Python 2.7, 3.4 and 3.5. [76]
The libgit2 implementation of Git is an ANSI C software library with no other dependencies, which can be built on multiple platforms, including Windows, Linux, macOS, and BSD. [77] It has bindings for many programming languages, including Ruby, Python, and Haskell. [78] [79] [80]
JS-Git is a JavaScript implementation of a subset of Git. [81]
GameOfTrees is an open-source implementation of Git for the OpenBSD project. [82]
As Git is a distributed version control system, it can be used as a server out of the box. It is shipped with a built-in command, git daemon, which starts a simple TCP server running on the Git protocol. [83] [84] Dedicated Git HTTP servers help (amongst other features) by adding access control, displaying the contents of a Git repository via web interfaces, and managing multiple repositories. Existing Git repositories can be cloned and shared to be used by others as a centralized repo. A repository can also be accessed via remote shell just by having the Git software installed and allowing a user to log in. [85] Git servers typically listen on TCP port 9418. [86]
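For a sense of what travels over that port, the Python sketch below builds the first request a client would send to a server running the Git protocol. It assumes the pkt-line framing used by that protocol (a four-digit hexadecimal length that counts itself, followed by the payload); the repository path, the host name, and the function names are hypothetical, and no network connection is opened.

```python
DEFAULT_GIT_PORT = 9418  # the port mentioned above


def pkt_line(payload: bytes) -> bytes:
    """Prefix the payload with its total length as four lowercase hex digits."""
    length = len(payload) + 4  # the four length digits are included in the count
    return f"{length:04x}".encode("ascii") + payload


def upload_pack_request(repo_path: str, host: str) -> bytes:
    """Frame a request asking the server to serve repo_path for a fetch or clone."""
    payload = f"git-upload-pack {repo_path}\x00host={host}\x00".encode("utf-8")
    return pkt_line(payload)


request = upload_pack_request("/project.git", "myserver.com")
print(request)
```

A client would write these bytes to a TCP socket connected to the server on `DEFAULT_GIT_PORT` and then read the server's reply in the same framing.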
There are many offerings of Git repositories as a service. The most popular are GitHub, SourceForge, Bitbucket and GitLab. [91] [17] [18] [19] [20]
The Eclipse Foundation reported in its annual community survey that, as of May 2014, Git was the most widely used source-code management tool, with 42.9% of professional software developers reporting that they used Git as their primary source-control system, [92] compared with 36.3% in 2013 and 32% in 2012; for Git responses excluding use of GitHub, the figures were 33.3% in 2014, 30.3% in 2013, 27.6% in 2012 and 12.8% in 2011. [93] Open-source directory Black Duck Open Hub reports a similar uptake among open-source projects. [94]
Stack Overflow has included version control in their annual developer survey [95] in 2015 (16,694 responses), [96] 2017 (30,730 responses), [97] 2018 (74,298 responses) [98] and 2022 (71,379 responses). [15] Git was the overwhelming favorite of responding developers in these surveys, with its share reaching 93.9% in 2022.
Version control systems used by responding developers:
Name | 2015 | 2017 | 2018 | 2022 |
---|---|---|---|---|
Git | 69.3% | 69.2% | 87.2% | 93.9% |
Subversion | 36.9% | 9.1% | 16.1% | 5.2% |
TFVC | 12.2% | 7.3% | 10.9% | [ii] |
Mercurial | 7.9% | 1.9% | 3.6% | 1.1% |
CVS | 4.2% | [ii] | [ii] | [ii] |
Perforce | 3.3% | [ii] | [ii] | [ii] |
VSS | [ii] | 0.6% | [ii] | [ii] |
IBM DevOps Code ClearCase | [ii] | 0.4% | [ii] | [ii] |
Zip file backups | [ii] | 2.0% | 7.9% | [ii] |
Raw network sharing | [ii] | 1.7% | 7.9% | [ii] |
Other | 5.8% | 3.0% | [ii] | [ii] |
None | 9.3% | 4.8% | 4.8% | 4.3% |
The UK IT jobs website itjobswatch.co.uk reports that as of late September 2016, 29.27% of UK permanent software development job openings have cited Git, [99] ahead of 12.17% for Microsoft Team Foundation Server, [100] 10.60% for Subversion, [101] 1.30% for Mercurial, [102] and 0.48% for Visual SourceSafe. [103]
There are many Git extensions, like Git LFS, which started as an extension to Git in the GitHub community and is now widely used by other repositories. Extensions are usually independently developed and maintained by different people, but at some point in the future, a widely used extension can be merged with Git.
Other open-source Git extensions include:
Microsoft developed the Virtual File System for Git (VFS for Git; formerly Git Virtual File System or GVFS) extension to handle the size of the Windows source-code tree as part of their 2017 migration from Perforce. VFS for Git allows cloned repositories to use placeholders whose contents are downloaded only once a file is accessed. [104]
Git does not impose many restrictions on how it should be used, but some conventions are adopted in order to organize histories, especially those which require the cooperation of many contributors.
Git does not provide access-control mechanisms, but was designed for operation with other tools that specialize in access control. [114]
On 17 December 2014, an exploit was found affecting the Windows and macOS versions of the Git client. An attacker could perform arbitrary code execution on a target computer with Git installed by creating a malicious Git tree (directory) named .git (a directory in Git repositories that stores all the data of the repository) in a different case (such as .GIT or .Git, needed because Git does not allow the all-lowercase version of .git to be created manually) with malicious files in the .git/hooks subdirectory (a folder with executable files that Git runs) on a repository that the attacker made or on a repository that the attacker can modify. If a Windows or Mac user pulls (downloads) a version of the repository with the malicious directory, then switches to that directory, the .git directory will be overwritten (due to the case-insensitive trait of the Windows and Mac filesystems) and the malicious executable files in .git/hooks may be run, which results in the attacker's commands being executed. An attacker could also modify the .git/config configuration file, which allows the attacker to create malicious Git aliases (aliases for Git commands or external commands) or modify extant aliases to execute malicious commands when run. The vulnerability was patched in version 2.2.1 of Git, released on 17 December 2014, and announced the next day. [115] [116]
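The core of the problem can be illustrated with a small Python sketch: a name that differs from .git only by letter case refers to the same directory on a case-insensitive file system. The check below is a deliberately simplified assumption of how such names could be screened, with a hypothetical function name; it does not claim to reproduce the checks in the actual patch.

```python
def is_suspicious_dotgit(path: str) -> bool:
    """Return True if any path component equals '.git' when letter case is ignored."""
    components = path.replace("\\", "/").split("/")
    return any(part.casefold() == ".git" for part in components)


# Hypothetical tree entries as they might appear in a fetched repository.
for entry in (".GIT/hooks/post-checkout", "sub/.Git/config", "src/.gitignore"):
    print(entry, is_suspicious_dotgit(entry))
```

Entries such as `.gitignore` pass, because only a whole path component matching `.git` is treated as a collision with the repository's own metadata directory.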
Git version 2.6.1, released on 29 September 2015, contained a patch for a security vulnerability (CVE-2015-7545) [117] that allowed arbitrary code execution. [118] The vulnerability was exploitable if an attacker could convince a victim to clone a specific URL, as the arbitrary commands were embedded in the URL itself. [119] An attacker could use the exploit via a man-in-the-middle attack if the connection was unencrypted, [119] as they could redirect the user to a URL of their choice. Recursive clones were also vulnerable since they allowed the controller of a repository to specify arbitrary URLs via the gitmodules file. [119]
Git uses SHA-1 hashes internally. Linus Torvalds has said that the hash was mostly to guard against accidental corruption, and the security a cryptographically secure hash gives was just an accidental side effect, with the main security being signing elsewhere. [120] [121] Following a demonstration of the SHAttered attack against Git in 2017, Git was modified to use a SHA-1 variant resistant to this attack. A plan for a hash function transition has been in writing since February 2020. [122]
"Git" is a registered word trademark of Software Freedom Conservancy under US500000085961336 since 2015-02-03.
Original author(s) | Linus Torvalds [1] |
---|---|
Developer(s) | Junio Hamano and others [2] |
Initial release | 7 April 2005 |
Stable release | 2.44.0
[3]
/ 23 February 2024 |
Repository | |
Written in | Primarily in C, with GUI and programming scripts written in Shell script, Perl, Tcl and Python [4] [5] |
Operating system | POSIX ( Linux, macOS, Solaris, AIX), Windows |
Available in | English |
Type | Version control |
License | GPL-2.0-only [i] [7] |
Website |
git-scm |
Git ( /ɡɪt/) [8] is a distributed version control system [9] that tracks changes in any set of computer files, usually used for coordinating work among programmers who are collaboratively developing source code during software development.
Git's goals include speed, data integrity, and support for distributed, non-linear workflows (thousands of parallel branches running on different computers). [10] [11] [12] Git was originally authored by Linus Torvalds in 2005 for the development of the Linux kernel, with other kernel developers contributing to its initial development. [13] It was prompted by the revocation of the free license of BitKeeper, the proprietary source-control management system used for Linux kernel development since 2002. Since 2005, Junio Hamano has been the core maintainer of Git. As with most other distributed version control systems, and unlike most client–server systems, every Git directory on every computer is a full-fledged repository with complete history and full version-tracking abilities, independent of network access or a central server. [14] Git is a free and open-source software shared under the GPL-2.0-only license.
Git's design benefits from Torvalds' experience with Linux and file-system performance, leading to features such as support for non-linear development, efficient handling of large projects, and cryptographic authentication of history. Its toolkit-based design allows for pluggable merge strategies and flexibility in managing version control tasks. Despite its comprehensive feature set, Git has faced security challenges, leading to updates and patches that address vulnerabilities. The trademark "Git" is registered by the Software Freedom Conservancy, marking its official recognition and continued evolution in the open-source community.
Git's adoption has grown rapidly, becoming the most popular distributed version control system, with nearly 95% of developers reporting it as their primary version control system as of 2022. [15] It is the most widely used source-code management tool among professional developers. There are offerings of Git repository services, including GitHub, SourceForge, Bitbucket and GitLab. [16] [17] [18] [19] [20]
Git development was started by Torvalds in April 2005 when the proprietary source-control management (SCM) system used for Linux kernel development since 2002, BitKeeper, revoked its free license for Linux development. [21] [22] The copyright holder of BitKeeper, Larry McVoy, claimed that Andrew Tridgell had created SourcePuller by reverse engineering the BitKeeper protocols. [23] The same incident also spurred the creation of another version-control system, Mercurial.
Torvalds wanted a distributed system that he could use like BitKeeper, but none of the available free systems met his needs. He cited an example of a source-control management system needing 30 seconds to apply a patch and update all associated metadata, and noted that this would not scale to the needs of Linux kernel development, where synchronizing with fellow maintainers could require 250 such actions at once. For his design criterion, he specified that patching should take no more than three seconds, and added three more goals: [10]
These criteria eliminated every version-control system in use at the time, so immediately after the 2.6.12-rc2 Linux kernel development release, Torvalds set out to write his own. [12]
The development of Git began on 3 April 2005. [24] Torvalds announced the project on 6 April and became self-hosting the next day. [24] [25] The first merge of multiple branches took place on 18 April. [26] Torvalds achieved his performance goals; on 29 April, the nascent Git was benchmarked recording patches to the Linux kernel tree at a rate of 6.7 patches per second. [27] On 16 June, Git managed the kernel 2.6.12 release. [28]
Torvalds turned over maintenance on 26 July 2005 to Junio Hamano, a major contributor to the project. [29] Hamano was responsible for the 1.0 release on 21 December 2005. [30]
Torvalds sarcastically quipped about the name git (which means "unpleasant person" in British English slang): "I'm an egotistical bastard, and I name all my projects after myself. First ' Linux', now 'git'." [31] [32] The man page describes Git as "the stupid content tracker". [33]
The read-me file of the source code elaborates further: [34]
"git" can mean anything, depending on your mood.
- Random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
- Stupid. Contemptible and despicable. Simple. Take your pick from the dictionary of slang.
- "Global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
- "Goddamn idiotic truckload of sh*t": when it breaks.
The source code for Git refers to the program as "the information manager from hell".
Git's design is a synthesis of Torvalds's experience with Linux in maintaining a large distributed development project, along with his intimate knowledge of file-system performance gained from the same project and the urgent need to produce a working system in short order. These influences led to the following implementation choices: [13]
git gc
.
[44]
[45]git gc
command.
[46] For data integrity, both the packfile and its index have an
SHA-1 checksum
[47] inside, and the file name of the packfile also contains an SHA-1 checksum. To check the integrity of a repository, run the git fsck
command.
[48]
[49]Another property of Git is that it snapshots directory trees of files. The earliest systems for tracking versions of source code, Source Code Control System (SCCS) and Revision Control System (RCS), worked on individual files and emphasized the space savings to be gained from interleaved deltas (SCCS) or delta encoding (RCS) the (mostly similar) versions. Later revision-control systems maintained this notion of a file having an identity across multiple revisions of a project. However, Torvalds rejected this concept. [50] Consequently, Git does not explicitly record file revision relationships at any level below the source-code tree.
These implicit revision relationships have some significant consequences:
Git implements several merging strategies; a non-default strategy can be selected at merge time: [56]
When there are more than one common ancestors that can be used for a three-way merge, it creates a merged tree of the common ancestors and uses that as the reference tree for the three-way merge. This has been reported to result in fewer merge conflicts without causing mis-merges by tests done on prior merge commits taken from Linux 2.6 kernel development history. Also, this can detect and handle merges involving renames.
— Linus Torvalds [57]
Git's primitives are not inherently a source-code management system. Torvalds explains: [58]
In many ways you can just see git as a filesystem—it's content-addressable, and it has a notion of versioning, but I really designed it coming at the problem from the viewpoint of a filesystem person (hey, kernels is what I do), and I actually have absolutely zero interest in creating a traditional SCM system.
From this initial design approach, Git has developed the full set of features expected of a traditional SCM, [59] with features mostly being created as needed, then refined and extended over time.
Git has two data structures: a mutable index (also called stage or cache) that caches information about the working directory and the next revision to be committed; and an immutable, append-only object database.
The index serves as a connection point between the object database and the working tree.
The object store contains five types of objects: [60] [48]
Each object is identified by a SHA-1 hash of its contents. Git computes the hash and uses this value for the object's name. The object is put into a directory matching the first two characters of its hash. The rest of the hash is used as the file name for that object.
Git stores each revision of a file as a unique blob. The relationships between the blobs can be found through examining the tree and commit objects. Newly added objects are stored in their entirety using zlib compression. This can consume a large amount of disk space quickly, so objects can be combined into packs, which use delta compression to save space, storing blobs as their changes relative to other blobs.
Additionally, Git stores labels called refs (short for references) to indicate the locations of various commits. They are stored in the reference database and are respectively: [66]
Every object in the Git database that is not referred to may be cleaned up by using a garbage collection command or automatically. An object may be referenced by another object or an explicit reference. Git knows different types of references. The commands to create, move, and delete references vary. git show-ref
lists all references. Some types are:
Git (the main implementation in C) is primarily developed on Linux, although it also supports most major operating systems, including the BSDs ( DragonFly BSD, FreeBSD, NetBSD, and OpenBSD), Solaris, macOS, and Windows. [68] [69]
The first Windows port of Git was primarily a Linux-emulation framework that hosts the Linux version. Installing Git under Windows creates a similarly named Program Files directory containing the Mingw-w64 port of the GNU Compiler Collection, Perl 5, MSYS2 (itself a fork of Cygwin, a Unix-like emulation environment for Windows) and various other Windows ports or emulations of Linux utilities and libraries. Currently, native Windows builds of Git are distributed as 32- and 64-bit installers. [70] The git official website currently maintains a build of Git for Windows, still using the MSYS2 environment. [71]
The JGit implementation of Git is a pure Java software library, designed to be embedded in any Java application. JGit is used in the Gerrit code-review tool, and in EGit, a Git client for the Eclipse IDE. [72]
Go-git is an open-source implementation of Git written in pure Go. [73] It is currently used for backing projects as a SQL interface for Git code repositories [74] and providing encryption for Git. [75]
The Dulwich implementation of Git is a pure Python software component for Python 2.7, 3.4 and 3.5. [76]
The libgit2 implementation of Git is an ANSI C software library with no other dependencies, which can be built on multiple platforms, including Windows, Linux, macOS, and BSD. [77] It has bindings for many programming languages, including Ruby, Python, and Haskell. [78] [79] [80]
JS-Git is a JavaScript implementation of a subset of Git. [81]
GameOfTrees is an open-source implementation of Git for the OpenBSD project. [82]
As Git is a distributed version control system, it can be used as a server out of the box. It ships with a built-in command, git daemon, which starts a simple TCP server running on the Git protocol. [83] [84] Dedicated Git HTTP servers help (amongst other features) by adding access control, displaying the contents of a Git repository via web interfaces, and managing multiple repositories. Existing Git repositories can be cloned and shared to be used by others as a centralized repository. A repository can also be accessed via remote shell just by having the Git software installed and allowing a user to log in. [85] Git servers typically listen on TCP port 9418. [86]
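As an illustration of what a client sends when it first connects to a server on the Git protocol, the following Python sketch builds the opening request in Git's length-prefixed "pkt-line" framing. It is a minimal sketch and not code taken from Git: the function names pkt_line and upload_pack_request, and the example path and host, are chosen here purely for illustration.

```python
GIT_DEFAULT_PORT = 9418  # port on which Git servers typically listen


def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as a pkt-line.

    The prefix is four hexadecimal digits giving the total length of the
    line, counting the four prefix bytes themselves.
    """
    length = len(payload) + 4
    if length > 65520:  # upper bound on the length of a single pkt-line
        raise ValueError("payload too large for a single pkt-line")
    return f"{length:04x}".encode("ascii") + payload


def upload_pack_request(path: str, host: str) -> bytes:
    """Build the first message a client sends to fetch from a repository.

    `path` and `host` are illustrative parameters: the repository path on
    the server and the host name the client connected to.
    """
    payload = f"git-upload-pack {path}\0host={host}\0".encode("utf-8")
    return pkt_line(payload)


if __name__ == "__main__":
    # Example values only; no network connection is opened here.
    print(upload_pack_request("/project.git", "myserver.com"))
```

A real client would write these bytes to a TCP socket connected to the server's port and then read the server's reply; the sketch stops at constructing the request so that it can be run without a server.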
There are many offerings of Git repositories as a service. The most popular are GitHub, SourceForge, Bitbucket and GitLab. [91] [17] [18] [19] [20]
The Eclipse Foundation reported in its annual community survey that as of May 2014, Git was the most widely used source-code management tool, with 42.9% of professional software developers reporting that they used Git as their primary source-control system, [92] compared with 36.3% in 2013 and 32% in 2012. Excluding use of GitHub, the figures for Git were 33.3% in 2014, 30.3% in 2013, 27.6% in 2012 and 12.8% in 2011. [93] Open-source directory Black Duck Open Hub reports a similar uptake among open-source projects. [94]
Stack Overflow has included version control in its annual developer survey [95] in 2015 (16,694 responses), [96] 2017 (30,730 responses), [97] 2018 (74,298 responses) [98] and 2022 (71,379 responses). [15] Git was the overwhelming favorite of responding developers in these surveys, with as many as 93.9% reporting that they used it in 2022.
Version control systems used by responding developers:
Name | 2015 | 2017 | 2018 | 2022 |
---|---|---|---|---|
Git | 69.3% | 69.2% | 87.2% | 93.9% |
Subversion | 36.9% | 9.1% | 16.1% | 5.2% |
TFVC | 12.2% | 7.3% | 10.9% | [ii] |
Mercurial | 7.9% | 1.9% | 3.6% | 1.1% |
CVS | 4.2% | [ii] | [ii] | [ii] |
Perforce | 3.3% | [ii] | [ii] | [ii] |
VSS | [ii] | 0.6% | [ii] | [ii] |
IBM DevOps Code ClearCase | [ii] | 0.4% | [ii] | [ii] |
Zip file backups | [ii] | 2.0% | 7.9% | [ii] |
Raw network sharing | [ii] | 1.7% | 7.9% | [ii] |
Other | 5.8% | 3.0% | [ii] | [ii] |
None | 9.3% | 4.8% | 4.8% | 4.3% |
The UK IT jobs website itjobswatch.co.uk reports that as of late September 2016, 29.27% of UK permanent software development job openings have cited Git, [99] ahead of 12.17% for Microsoft Team Foundation Server, [100] 10.60% for Subversion, [101] 1.30% for Mercurial, [102] and 0.48% for Visual SourceSafe. [103]
There are many Git extensions, like Git LFS, which started as an extension to Git in the GitHub community and is now widely used by other repositories. Extensions are usually independently developed and maintained by different people, but at some point in the future, a widely used extension can be merged with Git.
Other open-source Git extensions include:
Microsoft developed the Virtual File System for Git (VFS for Git; formerly Git Virtual File System or GVFS) extension to handle the size of the Windows source-code tree as part of their 2017 migration from Perforce. VFS for Git allows cloned repositories to use placeholders whose contents are downloaded only once a file is accessed. [104]
Git does not impose many restrictions on how it should be used, but some conventions are adopted in order to organize histories, especially those which require the cooperation of many contributors.
Git does not provide access-control mechanisms, but was designed for operation with other tools that specialize in access control. [114]
On 17 December 2014, an exploit was found affecting the Windows and macOS versions of the Git client. An attacker could perform arbitrary code execution on a target computer with Git installed by creating a malicious Git tree (directory) named .git (the directory in a Git repository that stores all the data of the repository) in a different case, such as .GIT or .Git. The different case is needed because Git does not allow the all-lowercase version of .git to be created manually. The attacker places malicious files in that tree's hooks subdirectory (a folder with executable files that Git runs), in a repository that the attacker made or can modify. If a Windows or Mac user pulls (downloads) a version of the repository with the malicious directory, then switches to that directory, the .git directory will be overwritten (because Windows and Mac filesystems are case-insensitive) and the malicious executable files in .git/hooks may be run, which results in the attacker's commands being executed. An attacker could also modify the .git/config configuration file, which allows the attacker to create malicious Git aliases (aliases for Git commands or external commands) or modify existing aliases to execute malicious commands when run. The vulnerability was patched in version 2.2.1 of Git, released on 17 December 2014, and announced the next day. [115] [116]
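The core of the problem described above is that a path component such as .Git is a different name from .git on a case-sensitive filesystem but the same name on a case-insensitive one. The following Python sketch shows the kind of check that detects such components. It is an illustrative assumption, not the code of the actual patch: the function name is hypothetical, and the sketch considers only letter-case differences, not any other ways in which a filesystem might treat two names as equal.

```python
from pathlib import PurePosixPath


def has_disguised_git_dir(path: str) -> bool:
    """Return True if any component of `path` equals ".git" ignoring case.

    Hypothetical helper for illustration: repository paths are treated as
    POSIX-style, slash-separated strings.
    """
    return any(
        part.casefold() == ".git" for part in PurePosixPath(path).parts
    )


if __name__ == "__main__":
    for candidate in (".Git/hooks/post-checkout", "src/.GIT/config",
                      "src/main.c", ".gitignore"):
        print(candidate, has_disguised_git_dir(candidate))
```

Names that merely begin with .git, such as .gitignore, are not flagged, because the comparison is made against whole path components.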
Git version 2.6.1, released on 29 September 2015, contained a patch for a security vulnerability ( CVE- 2015-7545) [117] that allowed arbitrary code execution. [118] The vulnerability was exploitable if an attacker could convince a victim to clone a specific URL, as the arbitrary commands were embedded in the URL itself. [119] An attacker could use the exploit via a man-in-the-middle attack if the connection was unencrypted, [119] as they could redirect the user to a URL of their choice. Recursive clones were also vulnerable since they allowed the controller of a repository to specify arbitrary URLs via the gitmodules file. [119]
Git uses SHA-1 hashes internally. Linus Torvalds has said that the hash was mostly to guard against accidental corruption, and that the security a cryptographically secure hash gives was just an accidental side effect, with the main security being signing elsewhere. [120] [121] Since the demonstration of the SHAttered attack against Git in 2017, Git has been modified to use a SHA-1 variant resistant to this attack. A plan for a hash function transition has been in progress since February 2020. [122]
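To make the role of the hash concrete, the following Python sketch computes the identifier Git assigns to a file's contents (a "blob" object): the SHA-1 digest of a short header followed by the content. The function name git_blob_id is chosen here for illustration, and the sketch uses the plain SHA-1 from Python's standard library rather than the collision-detecting variant mentioned above.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID Git would assign to `content` stored as a blob.

    The hashed data is the header "blob <size>" followed by a NUL byte,
    then the raw content.
    """
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The empty blob has a well-known ID:
    print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the identifier depends only on the content, changing a single byte of a file yields a different identifier, which is how accidental corruption becomes detectable.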
"Git" is a registered word trademark of Software Freedom Conservancy under US500000085961336 since 2015-02-03.
Torvalds seemed aware that his decision to drop BitKeeper would also be controversial. When asked why he called the new software, 'git', British slang meaning 'a rotten person', he said. 'I'm an egotistical bastard, so I name all my projects after myself. First Linux, now git.'
git-blame to show code moved between source files.
You may remember when Git introduced a new version of its network fetch protocol way back in 2018. That protocol is now used by default in 2.26, so let's refresh ourselves on what that means. The biggest problem with the old protocol is that the server would immediately list all of the branches, tags, and other references in the repository before the client had a chance to send anything. For some repositories, this could mean sending megabytes of extra data, when the client really only wanted to know about the master branch. The new protocol starts with the client request and provides a way for the client to tell the server which references it's interested in. Fetching a single branch will only ask about that branch, while most clones will only ask about branches and tags. This might seem like everything, but server repositories may store other references (such as the head of every pull request opened in the repository since its creation). Now, fetches from large repositories improve in speed, especially when the fetch itself is small, which makes the cost of the initial reference advertisement more expensive relatively speaking. And the best part is that you won't need to do anything! Due to some clever design, any client that speaks the new protocol can work seamlessly with both old and new servers, falling back to the original protocol if the server doesn't support it. The only reason for the delay between introducing the protocol and making it the default was to let early adopters discover any bugs.
Stack Overflow's annual Developer Survey is the largest and most comprehensive survey of people who code around the world. Each year, we field a survey covering everything from developers' favorite technologies to their job preferences. This year marks the ninth year we've published our annual Developer Survey results, and nearly 90,000 developers took the 20-minute survey earlier this year.
The "master" branch in Git is not a special branch. It is exactly like any other branch. The only reason nearly every repository has one is that the git init command creates it by default and most people don't bother to change it.
Reverting has two important advantages over resetting. First, it doesn't change the project history, which makes it a "safe" operation for commits that have already been published to a shared repository.