From f5e991bee01104342ade57d8f2ea51527190c50d Mon Sep 17 00:00:00 2001 From: Gabriel Majeri Date: Sun, 9 Sep 2018 14:22:58 +0300 Subject: [PATCH 1/6] Expand the documentation for the std::sync module Provides an overview on why synchronization is required, as well a short summary of what sync primitives are available. --- src/libstd/sync/mod.rs | 123 +++++++++++++++++++++++++++++++++++++++-- 1 file changed, 119 insertions(+), 4 deletions(-) diff --git a/src/libstd/sync/mod.rs b/src/libstd/sync/mod.rs index e12ef8d9eda2d..e06e299406933 100644 --- a/src/libstd/sync/mod.rs +++ b/src/libstd/sync/mod.rs @@ -10,10 +10,125 @@ //! Useful synchronization primitives. //! -//! This module contains useful safe and unsafe synchronization primitives. -//! Most of the primitives in this module do not provide any sort of locking -//! and/or blocking at all, but rather provide the necessary tools to build -//! other types of concurrent primitives. +//! ## The need for synchronization +//! +//! On an ideal single-core CPU, the timeline of events happening in a program +//! is linear, consistent with the order of operations in the code. +//! +//! Considering the following code, operating on some global static variables: +//! +//! ```rust +//! # static mut A: u32 = 0; +//! # static mut B: u32 = 0; +//! # static mut C: u32 = 0; +//! # unsafe { +//! A = 3; +//! B = 4; +//! A = A + B; +//! C = B; +//! println!("{} {} {}", A, B, C); +//! C = A; +//! # } +//! ``` +//! +//! It appears _as if_ some variables stored in memory are changed, an addition +//! is performed, result is stored in A and the variable C is modified twice. +//! When only a single thread is involved, the results are as expected: +//! the line `7 4 4` gets printed. +//! +//! As for what happens behind the scenes, when an optimizing compiler is used +//! the final generated machine code might look very different from the code: +//! +//! - first store to `C` might be moved before the store to `A` or `B`, +//! _as if_ we had written `C = 4; A = 3; B = 4;` +//! +//! - last store to `C` might be removed, since we never read from it again. +//! +//! - assignment of `A + B` to `A` might be removed, the sum can be stored in a +//! in a register until it gets printed, and the global variable never gets +//! updated. +//! +//! - the final result could be determined just by looking at the code at compile time, +//! so [constant folding] might turn the whole block into a simple `println!("7 4 4")` +//! +//! The compiler is allowed to perform any combination of these optimizations, as long +//! as the final optimized code, when executed, produces the same results as the one +//! without optimizations. +//! +//! When multiprocessing is involved (either multiple CPU cores, or multiple +//! physical CPUs), access to global variables (which are shared between threads) +//! could lead to nondeterministic results, **even if** compiler optimizations +//! are disabled. +//! +//! Note that thanks to Rust's safety guarantees, accessing global (static) +//! variables requires `unsafe` code, assuming we don't use any of the +//! synchronization primitives in this module. +//! +//! [constant folding]: https://en.wikipedia.org/wiki/Constant_folding +//! +//! ## Out-of-order execution +//! +//! Instructions can execute in a different order from the one we define, due to +//! various reasons: +//! +//! - **Compiler** reordering instructions: if the compiler can issue an +//! instruction at an earlier point, it will try to do so. For example, it +//! 
might hoist memory loads at the top of a code block, so that the CPU can +//! start [prefetching] the values from memory. +//! +//! In single-threaded scenarios, this can cause issues when writing signal handlers +//! or certain kinds of low-level code. +//! Use [compiler fences] to prevent this reordering. +//! +//! - **Single processor** executing instructions [out-of-order]: modern CPUs are +//! capable of [superscalar] execution, i.e. multiple instructions might be +//! executing at the same time, even though the machine code describes a +//! sequential process. +//! +//! This kind of reordering is handled transparently by the CPU. +//! +//! - **Multiprocessor** system, where multiple hardware threads run at the same time. +//! In multi-threaded scenarios, you can use two kinds of primitives to deal +//! with synchronization: +//! - [memory fences] to ensure memory accesses are made visibile to other +//! CPUs in the right order. +//! - [atomic operations] to ensure simultaneous access to the same memory +//! location doesn't lead to undefined behavior. +//! +//! [prefetching]: https://en.wikipedia.org/wiki/Cache_prefetching +//! [compiler fences]: atomic::compiler_fence +//! [out-of-order]: https://en.wikipedia.org/wiki/Out-of-order_execution +//! [superscalar]: https://en.wikipedia.org/wiki/Superscalar_processor +//! [memory fences]: atomic::fence +//! [atomics operations]: atomic +//! +//! ## Higher-level synchronization objects +//! +//! Most of the low-level synchronization primitives are quite error-prone and +//! inconvenient to use, which is why the standard library also exposes some +//! higher-level synchronization objects. +//! +//! These abstractions can be built out of lower-level primitives. For efficiency, +//! the sync objects in the standard library are usually implemented with help +//! from the operating system's kernel, which is able to reschedule the threads +//! while they are blocked on acquiring a lock. +//! +//! ## Efficiency +//! +//! Higher-level synchronization mechanisms are usually heavy-weight. +//! While most atomic operations can execute instantaneously, acquiring a +//! [`Mutex`] can involve blocking until another thread releases it. +//! For [`RwLock`], while! any number of readers may acquire it without +//! blocking, each writer will have exclusive access. +//! +//! On the other hand, communication over [channels] can provide a fairly +//! high-level interface without sacrificing performance, at the cost of +//! somewhat more memory. +//! +//! The more synchronization exists between CPUs, the smaller the performance +//! gains from multithreading will be. +//! +//! [channels]: mpsc #![stable(feature = "rust1", since = "1.0.0")] From e0df0ae734ec97ad7cc67cf6bed0d142275571b9 Mon Sep 17 00:00:00 2001 From: Gabriel Majeri Date: Sun, 16 Sep 2018 12:56:44 +0300 Subject: [PATCH 2/6] Make example code use global variables Because `fn main()` was added automatically, the variables were actually local statics. --- src/libstd/sync/mod.rs | 27 ++++++++++++++------------- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/src/libstd/sync/mod.rs b/src/libstd/sync/mod.rs index e06e299406933..df153561b4b16 100644 --- a/src/libstd/sync/mod.rs +++ b/src/libstd/sync/mod.rs @@ -18,17 +18,20 @@ //! Considering the following code, operating on some global static variables: //! //! ```rust -//! # static mut A: u32 = 0; -//! # static mut B: u32 = 0; -//! # static mut C: u32 = 0; -//! # unsafe { -//! A = 3; -//! B = 4; -//! A = A + B; -//! C = B; -//! 
println!("{} {} {}", A, B, C); -//! C = A; -//! # } +//! static mut A: u32 = 0; +//! static mut B: u32 = 0; +//! static mut C: u32 = 0; +//! +//! fn main() { +//! unsafe { +//! A = 3; +//! B = 4; +//! A = A + B; +//! C = B; +//! println!("{} {} {}", A, B, C); +//! C = A; +//! } +//! } //! ``` //! //! It appears _as if_ some variables stored in memory are changed, an addition @@ -42,8 +45,6 @@ //! - first store to `C` might be moved before the store to `A` or `B`, //! _as if_ we had written `C = 4; A = 3; B = 4;` //! -//! - last store to `C` might be removed, since we never read from it again. -//! //! - assignment of `A + B` to `A` might be removed, the sum can be stored in a //! in a register until it gets printed, and the global variable never gets //! updated. From f3fdbbfae8646be30d7a19db059b9cdc42fadbc4 Mon Sep 17 00:00:00 2001 From: Gabriel Majeri Date: Thu, 27 Sep 2018 20:25:04 +0300 Subject: [PATCH 3/6] Address review comments Reword the lead paragraph and turn the list items into complete sentences. --- src/libstd/sync/mod.rs | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/src/libstd/sync/mod.rs b/src/libstd/sync/mod.rs index df153561b4b16..bdb6e49aabc2a 100644 --- a/src/libstd/sync/mod.rs +++ b/src/libstd/sync/mod.rs @@ -12,8 +12,9 @@ //! //! ## The need for synchronization //! -//! On an ideal single-core CPU, the timeline of events happening in a program -//! is linear, consistent with the order of operations in the code. +//! Conceptually, a Rust program is simply a series of operations which will +//! be executed on a computer. The timeline of events happening in the program +//! is consistent with the order of the operations in the code. //! //! Considering the following code, operating on some global static variables: //! @@ -35,22 +36,22 @@ //! ``` //! //! It appears _as if_ some variables stored in memory are changed, an addition -//! is performed, result is stored in A and the variable C is modified twice. +//! is performed, result is stored in `A` and the variable `C` is modified twice. //! When only a single thread is involved, the results are as expected: //! the line `7 4 4` gets printed. //! -//! As for what happens behind the scenes, when an optimizing compiler is used -//! the final generated machine code might look very different from the code: +//! As for what happens behind the scenes, when optimizations are enabled the +//! final generated machine code might look very different from the code: //! -//! - first store to `C` might be moved before the store to `A` or `B`, -//! _as if_ we had written `C = 4; A = 3; B = 4;` +//! - The first store to `C` might be moved before the store to `A` or `B`, +//! _as if_ we had written `C = 4; A = 3; B = 4`. //! -//! - assignment of `A + B` to `A` might be removed, the sum can be stored in a -//! in a register until it gets printed, and the global variable never gets -//! updated. +//! - Assignment of `A + B` to `A` might be removed, since the sum can be stored +//! in a temporary location until it gets printed, with the global variable +//! never getting updated. //! -//! - the final result could be determined just by looking at the code at compile time, -//! so [constant folding] might turn the whole block into a simple `println!("7 4 4")` +//! - The final result could be determined just by looking at the code at compile time, +//! so [constant folding] might turn the whole block into a simple `println!("7 4 4")`. //! //! 
The compiler is allowed to perform any combination of these optimizations, as long //! as the final optimized code, when executed, produces the same results as the one @@ -77,8 +78,8 @@ //! might hoist memory loads at the top of a code block, so that the CPU can //! start [prefetching] the values from memory. //! -//! In single-threaded scenarios, this can cause issues when writing signal handlers -//! or certain kinds of low-level code. +//! In single-threaded scenarios, this can cause issues when writing +//! signal handlers or certain kinds of low-level code. //! Use [compiler fences] to prevent this reordering. //! //! - **Single processor** executing instructions [out-of-order]: modern CPUs are From bcec6bb525032b48d8d1793854f61892c21fe8af Mon Sep 17 00:00:00 2001 From: Gabriel Majeri Date: Thu, 27 Sep 2018 22:12:09 +0300 Subject: [PATCH 4/6] Fix broken links --- src/libstd/sync/mod.rs | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/src/libstd/sync/mod.rs b/src/libstd/sync/mod.rs index bdb6e49aabc2a..5ba569bf7ce5e 100644 --- a/src/libstd/sync/mod.rs +++ b/src/libstd/sync/mod.rs @@ -98,11 +98,11 @@ //! location doesn't lead to undefined behavior. //! //! [prefetching]: https://en.wikipedia.org/wiki/Cache_prefetching -//! [compiler fences]: atomic::compiler_fence +//! [compiler fences]: crate::sync::atomic::compiler_fence //! [out-of-order]: https://en.wikipedia.org/wiki/Out-of-order_execution //! [superscalar]: https://en.wikipedia.org/wiki/Superscalar_processor -//! [memory fences]: atomic::fence -//! [atomics operations]: atomic +//! [memory fences]: crate::sync::atomic::fence +//! [atomic operations]: crate::sync::atomic //! //! ## Higher-level synchronization objects //! @@ -120,7 +120,7 @@ //! Higher-level synchronization mechanisms are usually heavy-weight. //! While most atomic operations can execute instantaneously, acquiring a //! [`Mutex`] can involve blocking until another thread releases it. -//! For [`RwLock`], while! any number of readers may acquire it without +//! For [`RwLock`], while any number of readers may acquire it without //! blocking, each writer will have exclusive access. //! //! On the other hand, communication over [channels] can provide a fairly @@ -130,7 +130,9 @@ //! The more synchronization exists between CPUs, the smaller the performance //! gains from multithreading will be. //! -//! [channels]: mpsc +//! [`Mutex`]: crate::sync::Mutex +//! [`RwLock`]: crate::sync::RwLock +//! [channels]: crate::sync::mpsc #![stable(feature = "rust1", since = "1.0.0")] From 7e921aa59090096593cb4fa202041c91a5d1e36b Mon Sep 17 00:00:00 2001 From: Gabriel Majeri Date: Fri, 28 Sep 2018 10:59:45 +0300 Subject: [PATCH 5/6] Rewrite section on concurrency --- src/libstd/sync/mod.rs | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/libstd/sync/mod.rs b/src/libstd/sync/mod.rs index 5ba569bf7ce5e..edbed430e3866 100644 --- a/src/libstd/sync/mod.rs +++ b/src/libstd/sync/mod.rs @@ -57,16 +57,17 @@ //! as the final optimized code, when executed, produces the same results as the one //! without optimizations. //! -//! When multiprocessing is involved (either multiple CPU cores, or multiple -//! physical CPUs), access to global variables (which are shared between threads) -//! could lead to nondeterministic results, **even if** compiler optimizations -//! are disabled. +//! Due to the [concurrency] involved in modern computers, assumptions about +//! the program's execution order are often wrong. 
Access to global variables +//! can lead to nondeterministic results, **even if** compiler optimizations +//! are disabled, and it is **still possible** to introduce synchronization bugs. //! //! Note that thanks to Rust's safety guarantees, accessing global (static) //! variables requires `unsafe` code, assuming we don't use any of the //! synchronization primitives in this module. //! //! [constant folding]: https://en.wikipedia.org/wiki/Constant_folding +//! [concurrency]: https://en.wikipedia.org/wiki/Concurrency_(computer_science) //! //! ## Out-of-order execution //! From 6ba55847129e9a35b477e43b7a381ca00fd2a339 Mon Sep 17 00:00:00 2001 From: Gabriel Majeri Date: Fri, 5 Oct 2018 08:50:17 +0300 Subject: [PATCH 6/6] Address review comments --- src/libstd/sync/mod.rs | 110 +++++++++++++++++++++++++---------------- 1 file changed, 67 insertions(+), 43 deletions(-) diff --git a/src/libstd/sync/mod.rs b/src/libstd/sync/mod.rs index edbed430e3866..d69ebc1762272 100644 --- a/src/libstd/sync/mod.rs +++ b/src/libstd/sync/mod.rs @@ -12,11 +12,11 @@ //! //! ## The need for synchronization //! -//! Conceptually, a Rust program is simply a series of operations which will -//! be executed on a computer. The timeline of events happening in the program -//! is consistent with the order of the operations in the code. +//! Conceptually, a Rust program is a series of operations which will +//! be executed on a computer. The timeline of events happening in the +//! program is consistent with the order of the operations in the code. //! -//! Considering the following code, operating on some global static variables: +//! Consider the following code, operating on some global static variables: //! //! ```rust //! static mut A: u32 = 0; @@ -35,8 +35,10 @@ //! } //! ``` //! -//! It appears _as if_ some variables stored in memory are changed, an addition -//! is performed, result is stored in `A` and the variable `C` is modified twice. +//! It appears as if some variables stored in memory are changed, an addition +//! is performed, result is stored in `A` and the variable `C` is +//! modified twice. +//! //! When only a single thread is involved, the results are as expected: //! the line `7 4 4` gets printed. //! @@ -50,17 +52,19 @@ //! in a temporary location until it gets printed, with the global variable //! never getting updated. //! -//! - The final result could be determined just by looking at the code at compile time, -//! so [constant folding] might turn the whole block into a simple `println!("7 4 4")`. +//! - The final result could be determined just by looking at the code +//! at compile time, so [constant folding] might turn the whole +//! block into a simple `println!("7 4 4")`. //! -//! The compiler is allowed to perform any combination of these optimizations, as long -//! as the final optimized code, when executed, produces the same results as the one -//! without optimizations. +//! The compiler is allowed to perform any combination of these +//! optimizations, as long as the final optimized code, when executed, +//! produces the same results as the one without optimizations. //! -//! Due to the [concurrency] involved in modern computers, assumptions about -//! the program's execution order are often wrong. Access to global variables -//! can lead to nondeterministic results, **even if** compiler optimizations -//! are disabled, and it is **still possible** to introduce synchronization bugs. +//! Due to the [concurrency] involved in modern computers, assumptions +//! 
about the program's execution order are often wrong. Access to +//! global variables can lead to nondeterministic results, **even if** +//! compiler optimizations are disabled, and it is **still possible** +//! to introduce synchronization bugs. //! //! Note that thanks to Rust's safety guarantees, accessing global (static) //! variables requires `unsafe` code, assuming we don't use any of the @@ -74,7 +78,7 @@ //! Instructions can execute in a different order from the one we define, due to //! various reasons: //! -//! - **Compiler** reordering instructions: if the compiler can issue an +//! - The **compiler** reordering instructions: If the compiler can issue an //! instruction at an earlier point, it will try to do so. For example, it //! might hoist memory loads at the top of a code block, so that the CPU can //! start [prefetching] the values from memory. @@ -83,20 +87,20 @@ //! signal handlers or certain kinds of low-level code. //! Use [compiler fences] to prevent this reordering. //! -//! - **Single processor** executing instructions [out-of-order]: modern CPUs are -//! capable of [superscalar] execution, i.e. multiple instructions might be -//! executing at the same time, even though the machine code describes a -//! sequential process. +//! - A **single processor** executing instructions [out-of-order]: +//! Modern CPUs are capable of [superscalar] execution, +//! i.e. multiple instructions might be executing at the same time, +//! even though the machine code describes a sequential process. //! //! This kind of reordering is handled transparently by the CPU. //! -//! - **Multiprocessor** system, where multiple hardware threads run at the same time. -//! In multi-threaded scenarios, you can use two kinds of primitives to deal -//! with synchronization: -//! - [memory fences] to ensure memory accesses are made visibile to other -//! CPUs in the right order. -//! - [atomic operations] to ensure simultaneous access to the same memory -//! location doesn't lead to undefined behavior. +//! - A **multiprocessor** system executing multiple hardware threads +//! at the same time: In multi-threaded scenarios, you can use two +//! kinds of primitives to deal with synchronization: +//! - [memory fences] to ensure memory accesses are made visibile to +//! other CPUs in the right order. +//! - [atomic operations] to ensure simultaneous access to the same +//! memory location doesn't lead to undefined behavior. //! //! [prefetching]: https://en.wikipedia.org/wiki/Cache_prefetching //! [compiler fences]: crate::sync::atomic::compiler_fence @@ -111,29 +115,49 @@ //! inconvenient to use, which is why the standard library also exposes some //! higher-level synchronization objects. //! -//! These abstractions can be built out of lower-level primitives. For efficiency, -//! the sync objects in the standard library are usually implemented with help -//! from the operating system's kernel, which is able to reschedule the threads -//! while they are blocked on acquiring a lock. +//! These abstractions can be built out of lower-level primitives. +//! For efficiency, the sync objects in the standard library are usually +//! implemented with help from the operating system's kernel, which is +//! able to reschedule the threads while they are blocked on acquiring +//! a lock. +//! +//! The following is an overview of the available synchronization +//! objects: +//! +//! - [`Arc`]: Atomically Reference-Counted pointer, which can be used +//! in multithreaded environments to prolong the lifetime of some +//! 
data until all the threads have finished using it. +//! +//! - [`Barrier`]: Ensures multiple threads will wait for each other +//! to reach a point in the program, before continuing execution all +//! together. +//! +//! - [`Condvar`]: Condition Variable, providing the ability to block +//! a thread while waiting for an event to occur. //! -//! ## Efficiency +//! - [`mpsc`]: Multi-producer, single-consumer queues, used for +//! message-based communication. Can provide a lightweight +//! inter-thread synchronisation mechanism, at the cost of some +//! extra memory. //! -//! Higher-level synchronization mechanisms are usually heavy-weight. -//! While most atomic operations can execute instantaneously, acquiring a -//! [`Mutex`] can involve blocking until another thread releases it. -//! For [`RwLock`], while any number of readers may acquire it without -//! blocking, each writer will have exclusive access. +//! - [`Mutex`]: Mutual Exclusion mechanism, which ensures that at +//! most one thread at a time is able to access some data. //! -//! On the other hand, communication over [channels] can provide a fairly -//! high-level interface without sacrificing performance, at the cost of -//! somewhat more memory. +//! - [`Once`]: Used for thread-safe, one-time initialization of a +//! global variable. //! -//! The more synchronization exists between CPUs, the smaller the performance -//! gains from multithreading will be. +//! - [`RwLock`]: Provides a mutual exclusion mechanism which allows +//! multiple readers at the same time, while allowing only one +//! writer at a time. In some cases, this can be more efficient than +//! a mutex. //! +//! [`Arc`]: crate::sync::Arc +//! [`Barrier`]: crate::sync::Barrier +//! [`Condvar`]: crate::sync::Condvar +//! [`mpsc`]: crate::sync::mpsc //! [`Mutex`]: crate::sync::Mutex +//! [`Once`]: crate::sync::Once //! [`RwLock`]: crate::sync::RwLock -//! [channels]: crate::sync::mpsc #![stable(feature = "rust1", since = "1.0.0")]
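
The documentation assembled by these patches discusses atomic operations only in the abstract. As a minimal illustrative sketch (not part of the patch text itself), the global-counter idea from the module example can be expressed with the standard library's `AtomicUsize`, which makes concurrent access well-defined and needs no `unsafe`:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Unlike the `static mut` globals in the example above, an atomic
// can be read and modified from safe code, and concurrent access
// to it is well-defined.
static COUNTER: AtomicUsize = AtomicUsize::new(0);

fn main() {
    let handles: Vec<_> = (0..4)
        .map(|_| {
            thread::spawn(|| {
                for _ in 0..1000 {
                    // A single atomic read-modify-write: no increment
                    // can be lost to an interleaving thread.
                    COUNTER.fetch_add(1, Ordering::SeqCst);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Always prints 4000, regardless of scheduling.
    println!("{}", COUNTER.load(Ordering::SeqCst));
}
```

Because each `fetch_add` is one indivisible read-modify-write, the final count does not depend on how the threads interleave.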
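
The overview of higher-level synchronization objects can likewise be made concrete. The following sketch, illustrative rather than taken from the patches, combines `Arc` and `Mutex` so that several threads mutate one shared vector:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // The vector is owned by the `Mutex`; the `Arc` lets several
    // threads share ownership of that mutex.
    let data = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..4)
        .map(|i| {
            let data = Arc::clone(&data);
            thread::spawn(move || {
                // `lock` blocks until no other thread holds the lock,
                // then grants exclusive access to the vector.
                data.lock().unwrap().push(i);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // All four values are present once every thread has finished;
    // only their order depends on scheduling.
    println!("{:?}", data.lock().unwrap());
}
```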
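
Finally, the `mpsc` bullet mentions message-based communication. A minimal sketch of a multi-producer channel, again offered only as an illustration of the API the documentation points to, might look like this:

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // Multi-producer, single-consumer: the sender can be cloned and
    // moved into several threads, while the receiver stays on this one.
    let (tx, rx) = mpsc::channel();

    for i in 0..4 {
        let tx = tx.clone();
        thread::spawn(move || {
            // `send` transfers ownership of the value to the receiver.
            tx.send(i).unwrap();
        });
    }
    // Drop the original sender so the receiver can observe that all
    // producers are gone and the loop below terminates.
    drop(tx);

    for value in rx {
        println!("got {}", value);
    }
}
```

Compared to locking a shared structure, the channel trades some extra memory for a queue that needs no explicit synchronization at the call sites, which is the trade-off the documentation describes.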