# base/containers library

[TOC]

## What goes here

This directory contains some STL-like containers.

Things should be moved here that are generally applicable across the code base.
Don't add things here just because you need them in one place and think others
may someday want something similar. You can put specialized containers in your
component's directory and we can promote them here later if we feel there is
broad applicability.
### Design and naming

Fundamental [//base principles](../README.md#design-and-naming) apply, i.e.:

Containers should adhere as closely to STL as possible. Functions and behaviors
not present in STL should only be added when they are related to the specific
data structure implemented by the container.

For STL-like containers our policy is that they should use STL-like naming even
when it may conflict with the style guide. So functions and class names should
be lower case with underscores. Non-STL-like classes and functions should use
Google naming. Be sure to use the base namespace.
## Map and set selection

### Usage advice

* Generally avoid `std::unordered_set` and `std::unordered_map`. In the common
  case, query performance is unlikely to be sufficiently higher than
  `std::map` to make a difference, insert performance is slightly worse, and
  the memory overhead is high. This makes sense mostly for large tables where
  you expect a lot of lookups.

* Most maps and sets in Chrome are small and contain objects that can be moved
  efficiently. In this case, consider `base::flat_map` and `base::flat_set`.
  You need to be aware of the maximum expected size of the container since
  individual inserts and deletes are O(n), giving O(n^2) construction time for
  the entire map. But because it avoids mallocs in most cases, inserts are
  better than or comparable to other containers even for several dozen items,
  and efficiently-moved types are unlikely to have performance problems for
  most cases until you have hundreds of items. If your container can be
  constructed in one shot, the constructor from vector gives O(n log n)
  construction time and it should be strictly better than a `std::map` (a
  sketch of one-shot construction follows this list).

  Conceptually, inserting a range of n elements into a `base::flat_map` or
  `base::flat_set` behaves as if insert() were called for each element
  individually. Thus, if the input range contains repeated elements, only the
  first of these duplicates will be inserted into the container. This behavior
  applies to construction from a range as well.

* `base::small_map` has better runtime memory usage without the poor mutation
  performance that `base::flat_map` has on large containers. But this
  advantage is partially offset by additional code size. Prefer it in cases
  where you create many objects, so that the code/heap tradeoff is good.

* Use `std::map` and `std::set` if you can't decide. Even if they're not
  great, they're unlikely to be bad or surprising.
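For example, a container that can be built in one shot might be constructed
like this (a minimal sketch; `BuildNameSet` and its inputs are hypothetical,
mirroring the smart pointer example further below):

```cpp
#include <string>
#include <utility>
#include <vector>

#include "base/containers/flat_set.h"

// Hypothetical example: the names vector is collected first, then the set is
// built in one shot. Constructing from the vector sorts it once (O(n log n))
// instead of paying O(n) per insert; duplicate names keep only their first
// occurrence, as described above.
base::flat_set<std::string> BuildNameSet(std::vector<std::string> names) {
  return base::flat_set<std::string>(std::move(names));
}
```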
### Map and set details

Sizes are on 64-bit platforms. Stable iterators aren't invalidated when the
container is mutated.

| Container                                   | Empty size            | Per-item overhead | Stable iterators? |
|:------------------------------------------- |:--------------------- |:----------------- |:----------------- |
| `std::map`, `std::set`                      | 16 bytes              | 32 bytes          | Yes               |
| `std::unordered_map`, `std::unordered_set`  | 128 bytes             | 16-24 bytes       | No                |
| `base::flat_map`, `base::flat_set`          | 24 bytes              | 0 (see notes)     | No                |
| `base::small_map`                           | 24 bytes (see notes)  | 32 bytes          | No                |

**Takeaways:** `std::unordered_map` and `std::unordered_set` have high
overhead for small container sizes, so prefer these only for larger workloads.
Code size comparisons for a block of code (see appendix) on Windows using
strings as keys.

| Container             | Code size  |
|:--------------------- |:---------- |
| `std::unordered_map`  | 1646 bytes |
| `std::map`            | 1759 bytes |
| `base::flat_map`      | 1872 bytes |
| `base::small_map`     | 2410 bytes |

**Takeaways:** `base::small_map` generates more code because of the inlining of
both brute-force and red-black tree searching. This makes it less attractive
for random one-off uses. But if your code is called frequently, the runtime
memory benefits will be more important. The code sizes of the other maps are
close enough that it's not worth worrying about.
### std::map and std::set

A red-black tree. Each inserted item requires the memory allocation of a node
on the heap. Each node contains a left pointer, a right pointer, a parent
pointer, and a "color" for the red-black tree (32 bytes per item on 64-bit
platforms).
### std::unordered\_map and std::unordered\_set

A hash table. Implemented on Windows as a `std::vector` + `std::list` and in
libc++ as the equivalent of a `std::vector` + a `std::forward_list`. Both
implementations allocate an 8-entry hash table (containing iterators into the
list) on initialization, and grow to 64 entries once 8 items are inserted.
Above 64 items, the size doubles every time the load factor exceeds 1.

The empty size is `sizeof(std::unordered_map)` = 64 bytes + the initial hash
table, which is 8 pointers. The per-item overhead in the table above counts the
list node (2 pointers on Windows, 1 pointer in libc++), plus amortizes the hash
table assuming a 0.5 load factor on average.

In a microbenchmark on Windows, inserts of 1M integers into a
`std::unordered_set` took 1.07x the time of `std::set`, and queries took 0.67x
the time of `std::set`. For a typical 4-entry set (the statistical mode of map
sizes in the browser), query performance is identical to `std::set` and
`base::flat_set`. On ARM, `std::unordered_set` performance can be worse because
integer division to compute the bucket is slow, and a few "less than"
operations can be faster than computing a hash depending on the key type. The
takeaway is that you should not default to using unordered maps because
"they're faster."
### base::flat\_map and base::flat\_set

A sorted `std::vector`. Lookups are done via binary search; inserts in the
middle require moving elements to make room. Good cache locality. For large
objects and large set sizes, `std::vector`'s doubling-when-full strategy can
waste memory.

Supports efficient construction from a vector of items, which avoids the O(n^2)
cost of inserting each element separately.

The per-item overhead will depend on the underlying `std::vector`'s reallocation
strategy and the memory access pattern. Assuming items are being linearly added,
one would expect it to be 3/4 full, so the per-item overhead will be 0.25 *
sizeof(T).

`flat_set` and `flat_map` support a notion of transparent comparisons.
Therefore you can, for example, look up a `base::StringPiece` in a set of
`std::string`s without constructing a temporary `std::string`. This
functionality is based on C++14 extensions to the `std::set`/`std::map`
interface.

You can find more information about transparent comparisons in [the `less<void>`
documentation](https://en.cppreference.com/w/cpp/utility/functional/less_void).
Example, smart pointer set:

```cpp
// Declare a type alias using base::UniquePtrComparator.
template <typename T>
using UniquePtrSet = base::flat_set<std::unique_ptr<T>,
                                    base::UniquePtrComparator>;

// ...
// Collect data.
std::vector<std::unique_ptr<int>> ptr_vec;
ptr_vec.reserve(5);
std::generate_n(std::back_inserter(ptr_vec), 5, [] {
  return std::make_unique<int>(0);
});

// Construct a set.
UniquePtrSet<int> ptr_set(std::move(ptr_vec));

// Use raw pointers to look up keys.
int* ptr = ptr_set.begin()->get();
EXPECT_TRUE(ptr_set.find(ptr) == ptr_set.begin());
```
Example `flat_map<std::string, int>`:

```cpp
base::flat_map<std::string, int> str_to_int({{"a", 1}, {"c", 2}, {"b", 2}});

// Does not construct temporary strings.
str_to_int.find("c")->second = 3;
str_to_int.erase("c");
EXPECT_EQ(str_to_int.end(), str_to_int.find("c"));

// NOTE: This does construct a temporary string. This happens because if the
// item is not in the container, then it needs to be constructed, which is
// something that transparent comparators don't have to guarantee.
str_to_int["c"] = 3;
```
### base::small\_map

A small, brute-force-searched inline buffer that overflows into a full
`std::map` or `std::unordered_map` when it gets too big. This gives the memory
benefit of `base::flat_map` for small data sizes without the degenerate
insertion performance for large container sizes.

Since instantiations require both the code for a `std::map` and a brute-force
search of the inline container, plus a fancy iterator to cover both cases, code
size is larger.

The initial size in the above table assumes a very small inline table. The
actual size will be `sizeof(int) + min(sizeof(std::map), sizeof(T) *
inline_size)`.
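A typical declaration might look like the following (a minimal sketch; the
inline capacity of 4 and the `CountWord` helper are illustrative assumptions,
not requirements; see `small_map.h` for the exact template signature):

```cpp
#include <map>
#include <string>

#include "base/containers/small_map.h"

// Hypothetical usage: keep up to four entries in the inline array and
// brute-force search them; fall back to a heap-allocated std::map only once a
// fifth entry is inserted. The "4" is an assumed inline capacity.
base::small_map<std::map<std::string, int>, 4> counts;

void CountWord(const std::string& word) {
  // Operates on the inline array until the map overflows.
  ++counts[word];
}
```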
## Deque

### Usage advice

Chromium code should always use `base::circular_deque` or `base::queue` in
preference to `std::deque` or `std::queue` due to memory usage and platform
variation.

The `base::circular_deque` implementation (and the `base::queue` which uses it)
provides performance consistent across platforms that better matches most
programmers' expectations on performance (it doesn't waste as much space as
libc++ and doesn't do as many heap allocations as MSVC). It also generates less
code than `std::queue`: using it across the code base saves several hundred
kilobytes.

Since `base::circular_deque` does not have stable iterators and it will move
the objects it contains, it may not be appropriate for all uses. If you need
these guarantees, consider using a `std::list`, which will provide constant
time insert and erase.
### std::deque and std::queue

The implementation of `std::deque` varies considerably, which makes it hard to
reason about. All implementations use a sequence of data blocks referenced by
an array of pointers. The standard guarantees random access, amortized
constant operations at the ends, and linear mutations in the middle.

In Microsoft's implementation, each block is the smaller of 16 bytes or the
size of the contained element. This means in practice that every expansion of
the deque of non-trivial classes requires a heap allocation. libc++ (on Android
and Mac) uses 4K blocks which eliminates the problem of many heap allocations,
but generally wastes a large amount of space (an Android analysis revealed more
than 2.5MB wasted space from deque alone, resulting in some optimizations).
libstdc++ uses an intermediate-size 512-byte buffer.

Microsoft's implementation never shrinks the deque capacity, so the capacity
will always be the maximum number of elements ever contained. libstdc++
deallocates blocks as they are freed. libc++ keeps up to two empty blocks.
### base::circular_deque and base::queue

A deque implemented as a circular buffer in an array. The underlying array will
grow like a `std::vector` while the beginning and end of the deque will move
around. The items will wrap around the underlying buffer so the storage will
not be contiguous, but fast random access iterators are still possible.

When the underlying buffer is filled, it will be reallocated and the contents
moved (like a `std::vector`). The underlying buffer will be shrunk if there is
too much wasted space (_unlike_ a `std::vector`). As a result, iterators are
not stable across mutations.
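A minimal FIFO sketch (the `Enqueue`/`DequeueNext` helpers and the element type
are hypothetical):

```cpp
#include "base/containers/circular_deque.h"

// Hypothetical FIFO of pending work item ids. push_back/pop_front are
// amortized constant time; iterators and references are invalidated by
// mutations, so don't hold them across a push or pop.
base::circular_deque<int> pending;

void Enqueue(int id) {
  pending.push_back(id);
}

int DequeueNext() {
  int id = pending.front();
  pending.pop_front();  // May shrink the underlying buffer over time.
  return id;
}
```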
## Stack

`std::stack` is like `std::queue` in that it is a wrapper around an underlying
container. The default container is `std::deque`, so everything from the deque
section applies.

Chromium provides `base/containers/stack.h`, which defines `base::stack`, which
should be used in preference to `std::stack`. It changes the underlying
container to `base::circular_deque`. The result will be very similar to
manually specifying a `std::vector` for the underlying implementation, except
that the storage will shrink when it gets too empty (vector will never
reallocate to a smaller size).
Watch out: with some stack usage patterns it's easy to depend on unstable
behavior:

```cpp
base::stack<Foo> stack;
for (...) {
  Foo& current = stack.top();
  DoStuff();  // May call stack.push(), say if writing a parser.
  current.done = true;  // |current| may reference a deleted item!
}
```
## Safety

Code throughout Chromium, running at any level of privilege, may directly or
indirectly depend on these containers. Much calling code implicitly or
explicitly assumes that these containers are safe and won't corrupt memory.
Unfortunately, [such assumptions have not always proven
true](https://bugs.chromium.org/p/chromium/issues/detail?id=817982).
Therefore, we are making an effort to ensure basic safety in these classes so
that callers' assumptions are true. In particular, we are adding bounds checks,
arithmetic overflow checks, and checks for internal invariants to the base
containers where necessary. Here, safety means that the implementation will
`CHECK`.

As of 8 August 2018, we have added checks to the following classes:

- `base::StringPiece`
- `base::span`
- `base::Optional`
- `base::RingBuffer`
- `base::small_map`

Ultimately, all base containers will have these checks.
### Safety, completeness, and efficiency

Safety checks can affect performance at the micro-scale, although they do not
always. On a larger scale, if we can have confidence that these fundamental
classes and templates are minimally safe, we can sometimes avoid the security
requirement to sandbox code that (for example) processes untrustworthy inputs.
Sandboxing is a relatively heavyweight response to memory safety problems, and
in our experience not all callers can afford to pay it.

(However, where affordable, privilege separation and reduction remain Chrome
Security Team's first approach to a variety of safety and security problems.)

One can also imagine that the safety checks should be passed on to callers who
require safety. There are several problems with that approach:

- Not all authors of all call sites will always
  - know when they need safety
  - remember to write the checks
  - write the checks correctly
  - write the checks maximally efficiently, considering
    - space
    - time
    - object code size
- These classes typically do not document themselves as being unsafe
- Some call sites have their requirements change over time
  - Code that gets moved from a low-privilege process into a high-privilege
    process
  - Code that changes from accepting inputs from only trustworthy sources to
    accepting inputs from all sources
- Putting the checks in every call site results in strictly larger object code
  than centralizing them in the callee

Therefore, the minimal checks that we are adding to these base classes are the
most efficient and effective way to achieve the beginning of the safety that we
need. (Note that we cannot account for undefined behavior in callers.)
## Appendix

### Code for map code size comparison

This just calls insert and query a number of times, with `printf`s that prevent
things from being dead-code eliminated.

```cpp
TEST(Foo, Bar) {
  base::small_map<std::map<std::string, Flubber>> foo;
  foo.insert(std::make_pair("foo", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar", Flubber(8, "bar")));
  foo.insert(std::make_pair("foo1", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar1", Flubber(8, "bar")));
  foo.insert(std::make_pair("foo", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar", Flubber(8, "bar")));
  auto found = foo.find("asdf");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("foo");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("bar");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("asdfhf");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("bar1");
  printf("Found is %d\n", (int)(found == foo.end()));
}
```