Mirror of https://github.com/AetherDroid/android_kernel_samsung_on5xelte.git (synced 2025-09-05 07:57:45 -04:00)
Commit f6dfaef42e: Fixed MTP to work with TWRP
50820 changed files with 20846062 additions and 0 deletions

Documentation/infiniband/core_locking.txt (normal file, 114 lines)
@@ -0,0 +1,114 @@

INFINIBAND MIDLAYER LOCKING

This guide is an attempt to make explicit the locking assumptions
made by the InfiniBand midlayer. It describes the requirements on
both low-level drivers that sit below the midlayer and upper level
protocols that use the midlayer.

Sleeping and interrupt context

With the following exceptions, a low-level driver implementation of
all of the methods in struct ib_device may sleep. The exceptions
are any methods from the list:

    create_ah
    modify_ah
    query_ah
    destroy_ah
    bind_mw
    post_send
    post_recv
    poll_cq
    req_notify_cq
    map_phys_fmr

which may not sleep and must be callable from any context.

The corresponding functions exported to upper level protocol
consumers:

    ib_create_ah
    ib_modify_ah
    ib_query_ah
    ib_destroy_ah
    ib_bind_mw
    ib_post_send
    ib_post_recv
    ib_req_notify_cq
    ib_map_phys_fmr

are therefore safe to call from any context.

In addition, the function

    ib_dispatch_event

used by low-level drivers to dispatch asynchronous events through
the midlayer is also safe to call from any context.

Reentrancy

All of the methods in struct ib_device exported by a low-level
driver must be fully reentrant. The low-level driver is required to
perform all synchronization necessary to maintain consistency, even
if multiple function calls using the same object are run
simultaneously.

The IB midlayer does not perform any serialization of function calls.

Because low-level drivers are reentrant, upper level protocol
consumers are not required to perform any serialization. However,
some serialization may be required to get sensible results. For
example, a consumer may safely call ib_poll_cq() on multiple CPUs
simultaneously. However, the ordering of the work completion
information between different calls of ib_poll_cq() is not defined.

Callbacks

A low-level driver must not perform a callback directly from the
same callchain as an ib_device method call. For example, it is not
allowed for a low-level driver to call a consumer's completion event
handler directly from its post_send method. Instead, the low-level
driver should defer this callback by, for example, scheduling a
tasklet to perform the callback.
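
The deferral rule can be modelled in plain userspace C. This is only
a sketch: every name in it (model_cq, model_post_send,
model_run_deferred, count_events) is hypothetical and none of them
are kernel APIs. A real driver would schedule a tasklet or similar
mechanism instead of calling model_run_deferred() by hand.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
typedef void (*comp_handler_t)(void *ctx);

struct model_cq {
	comp_handler_t handler; /* consumer's completion event handler */
	void *ctx;              /* consumer context passed to the handler */
	int pending;            /* completions waiting for the deferred pass */
};

/* Stands in for a driver's post_send: records work, never calls back. */
static int model_post_send(struct model_cq *cq)
{
	cq->pending++;
	return 0;
}

/* Stands in for the tasklet body: runs later, outside post_send. */
static void model_run_deferred(struct model_cq *cq)
{
	if (cq->pending > 0) {
		cq->pending = 0;
		cq->handler(cq->ctx);
	}
}

/* Example consumer handler: counts how many times it was invoked. */
static void count_events(void *ctx)
{
	(*(int *)ctx)++;
}
```

In this model the consumer's handler never runs inside
model_post_send(); it runs only when the deferred pass executes,
which mirrors the requirement that the callback not share a
callchain with the ib_device method.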

The low-level driver is responsible for ensuring that multiple
completion event handlers for the same CQ are not called
simultaneously. The driver must guarantee that only one CQ event
handler for a given CQ is running at a time. In other words, the
following situation is not allowed:

          CPU1                                    CPU2

    low-level driver ->
      consumer CQ event callback:
        /* ... */
        ib_req_notify_cq(cq, ...);
                                          low-level driver ->
        /* ... */                           consumer CQ event callback:
                                              /* ... */
        return from CQ event handler

The context in which completion event and asynchronous event
callbacks run is not defined. Depending on the low-level driver, it
may be process context, softirq context, or interrupt context.
Upper level protocol consumers may not sleep in a callback.

Hot-plug

A low-level driver announces that a device is ready for use by
consumers when it calls ib_register_device(); all initialization
must be complete before this call. The device must remain usable
until the driver's call to ib_unregister_device() has returned.

A low-level driver must call ib_register_device() and
ib_unregister_device() from process context. It must not hold any
semaphores that could cause deadlock if a consumer calls back into
the driver across these calls.

An upper level protocol consumer may begin using an IB device as
soon as the add method of its struct ib_client is called for that
device. A consumer must finish all cleanup and free all resources
relating to a device before returning from the remove method.

A consumer is permitted to sleep in its add and remove methods.

Documentation/infiniband/ipoib.txt (normal file, 105 lines)
@@ -0,0 +1,105 @@

IP OVER INFINIBAND

The ib_ipoib driver is an implementation of the IP over InfiniBand
protocol as specified by RFC 4391 and 4392, issued by the IETF ipoib
working group. It is a "native" implementation in the sense of
setting the interface type to ARPHRD_INFINIBAND and the hardware
address length to 20 (earlier proprietary implementations
masqueraded to the kernel as ethernet interfaces).

Partitions and P_Keys

When the IPoIB driver is loaded, it creates one interface for each
port using the P_Key at index 0. To create an interface with a
different P_Key, write the desired P_Key into the main interface's
/sys/class/net/<intf name>/create_child file. For example:

    echo 0x8001 > /sys/class/net/ib0/create_child

This will create an interface named ib0.8001 with P_Key 0x8001. To
remove a subinterface, use the "delete_child" file:

    echo 0x8001 > /sys/class/net/ib0/delete_child

The P_Key for any interface is given by the "pkey" file, and the
main interface for a subinterface is in "parent."

Child interfaces can also be created and deleted using IPoIB's
rtnl_link_ops; child interfaces created either way behave the same.
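
The naming of child interfaces can be sketched in C. The helper
below is hypothetical (ipoib_child_name is not a driver function),
and the zero-padded four-digit hex format is an assumption drawn
from the single ib0.8001 example above.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: formats "<parent>.<P_Key as 4 hex digits>",
 * e.g. "ib0.8001".  The "%04x" padding is an assumption based on the
 * one example in the text.  Returns 0 on success, -1 if buf is too small.
 */
static int ipoib_child_name(char *buf, size_t len,
			    const char *parent, unsigned int pkey)
{
	int n = snprintf(buf, len, "%s.%04x", parent, pkey & 0xffffu);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Calling ipoib_child_name(name, sizeof name, "ib0", 0x8001) fills
name with the interface name that the create_child example produces.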

Datagram vs Connected modes

The IPoIB driver supports two modes of operation: datagram and
connected. The mode is set and read through an interface's
/sys/class/net/<intf name>/mode file.

In datagram mode, the IB UD (Unreliable Datagram) transport is used
and so the interface MTU is equal to the IB L2 MTU minus the
IPoIB encapsulation header (4 bytes). For example, in a typical IB
fabric with a 2K MTU, the IPoIB MTU will be 2048 - 4 = 2044 bytes.
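
The datagram-mode arithmetic above, written out as a small C
sketch. The function name ipoib_ud_mtu and the macro below are
illustrative names chosen for this example, not references to
driver code.

```c
/* IPoIB encapsulation header size in bytes, as stated in the text. */
#define IPOIB_ENCAP_LEN 4

/* Datagram-mode interface MTU for a given IB L2 MTU, both in bytes. */
static unsigned int ipoib_ud_mtu(unsigned int ib_mtu)
{
	return ib_mtu - IPOIB_ENCAP_LEN;
}
```

For the 2K fabric in the example, ipoib_ud_mtu(2048) evaluates to
2044.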

In connected mode, the IB RC (Reliable Connected) transport is used.
Connected mode takes advantage of the connected nature of the IB
transport and allows an MTU up to the maximal IP packet size of 64K,
which reduces the number of IP packets needed for handling large UDP
datagrams, TCP segments, etc., and increases the performance for large
messages.

In connected mode, the interface's UD QP is still used for multicast
and communication with peers that don't support connected mode. In
this case, RX emulation of ICMP PMTU packets is used to cause the
networking stack to use the smaller UD MTU for these neighbours.

Stateless offloads

If the IB HW supports IPoIB stateless offloads, IPoIB advertises
TCP/IP checksum and/or Large Send (LSO) offloading capability to the
network stack.

Large Receive (LRO) offloading is also implemented and may be turned
on/off using ethtool calls. Currently LRO is supported only for
checksum offload capable devices.

Stateless offloads are supported only in datagram mode.

Interrupt moderation

If the underlying IB device supports CQ event moderation, one can
use ethtool to set interrupt mitigation parameters and thus reduce
the overhead incurred by handling interrupts. The main code path of
IPoIB doesn't use events for TX completion signaling so only RX
moderation is supported.

Debugging Information

By compiling the IPoIB driver with CONFIG_INFINIBAND_IPOIB_DEBUG set
to 'y', tracing messages are compiled into the driver. They are
turned on by setting the module parameters debug_level and
mcast_debug_level to 1. These parameters can be controlled at
runtime through files in /sys/module/ib_ipoib/.

CONFIG_INFINIBAND_IPOIB_DEBUG also enables files in the debugfs
virtual filesystem. By mounting this filesystem, for example with

    mount -t debugfs none /sys/kernel/debug

it is possible to get statistics about multicast groups from the
files /sys/kernel/debug/ipoib/ib0_mcg and so on.

The performance impact of this option is negligible, so it
is safe to enable this option with debug_level set to 0 for normal
operation.

CONFIG_INFINIBAND_IPOIB_DEBUG_DATA enables even more debug output in
the data path when data_debug_level is set to 1. However, even with
the output disabled, enabling this configuration option will affect
performance, because it adds tests to the fast path.

References

    Transmission of IP over InfiniBand (IPoIB) (RFC 4391)
        http://ietf.org/rfc/rfc4391.txt
    IP over InfiniBand (IPoIB) Architecture (RFC 4392)
        http://ietf.org/rfc/rfc4392.txt
    IP over InfiniBand: Connected Mode (RFC 4755)
        http://ietf.org/rfc/rfc4755.txt

Documentation/infiniband/sysfs.txt (normal file, 66 lines)
@@ -0,0 +1,66 @@

SYSFS FILES

For each InfiniBand device, the InfiniBand drivers create the
following files under /sys/class/infiniband/<device name>:

    node_type      - Node type (CA, switch or router)
    node_guid      - Node GUID
    sys_image_guid - System image GUID

In addition, there is a "ports" subdirectory, with one subdirectory
for each port. For example, if mthca0 is a 2-port HCA, there will
be two directories:

    /sys/class/infiniband/mthca0/ports/1
    /sys/class/infiniband/mthca0/ports/2

(A switch will only have a single "0" subdirectory for switch port
0; no subdirectory is created for normal switch ports)

In each port subdirectory, the following files are created:

    cap_mask       - Port capability mask
    lid            - Port LID
    lid_mask_count - Port LID mask count
    rate           - Port data rate (active width * active speed)
    sm_lid         - Subnet manager LID for port's subnet
    sm_sl          - Subnet manager SL for port's subnet
    state          - Port state (DOWN, INIT, ARMED, ACTIVE or ACTIVE_DEFER)
    phys_state     - Port physical state (Sleep, Polling, LinkUp, etc)

There is also a "counters" subdirectory, with files

    VL15_dropped
    excessive_buffer_overrun_errors
    link_downed
    link_error_recovery
    local_link_integrity_errors
    port_rcv_constraint_errors
    port_rcv_data
    port_rcv_errors
    port_rcv_packets
    port_rcv_remote_physical_errors
    port_rcv_switch_relay_errors
    port_xmit_constraint_errors
    port_xmit_data
    port_xmit_discards
    port_xmit_packets
    symbol_error

Each of these files contains the corresponding value from the port's
Performance Management PortCounters attribute, as described in
section 16.1.3.5 of the InfiniBand Architecture Specification.

The "pkeys" and "gids" subdirectories contain one file for each
entry in the port's P_Key or GID table respectively. For example,
ports/1/pkeys/10 contains the value at index 10 in port 1's P_Key
table.
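
A program can read any of the per-port files described above with
ordinary file I/O. The following is a minimal sketch using only the
C standard library; the helper names ib_port_attr_path and
read_sysfs_line are hypothetical, and the path layout is taken from
the directory structure described in this file.

```c
#include <stdio.h>
#include <string.h>

/* Builds /sys/class/infiniband/<dev>/ports/<port>/<attr> into buf.
 * Returns 0 on success, -1 if buf is too small. */
static int ib_port_attr_path(char *buf, size_t len, const char *dev,
			     unsigned int port, const char *attr)
{
	int n = snprintf(buf, len, "/sys/class/infiniband/%s/ports/%u/%s",
			 dev, port, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Reads the first line of a file into out and strips the newline.
 * Returns 0 on success, -1 if the file cannot be opened or read. */
static int read_sysfs_line(const char *path, char *out, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(out, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

For example, ib_port_attr_path(buf, sizeof buf, "mthca0", 1,
"pkeys/10") yields the path of the P_Key table entry mentioned
above, which read_sysfs_line() can then read on a machine that has
such a device.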

MTHCA

The Mellanox HCA driver also creates the files:

    hw_rev   - Hardware revision number
    fw_ver   - Firmware version
    hca_type - HCA type: "MT23108", "MT25208 (MT23108 compat mode)",
               or "MT25208"

Documentation/infiniband/user_mad.txt (normal file, 153 lines)
@@ -0,0 +1,153 @@

USERSPACE MAD ACCESS

Device files

Each port of each InfiniBand device has a "umad" device and an
"issm" device attached. For example, a two-port HCA will have two
umad devices and two issm devices, while a switch will have one
device of each type (for switch port 0).

Creating MAD agents

A MAD agent can be created by filling in a struct ib_user_mad_reg_req
and then calling the IB_USER_MAD_REGISTER_AGENT ioctl on a file
descriptor for the appropriate device file. If the registration
request succeeds, a 32-bit id will be returned in the structure.
For example:

    struct ib_user_mad_reg_req req = { /* ... */ };
    ret = ioctl(fd, IB_USER_MAD_REGISTER_AGENT, (char *) &req);
    if (!ret)
            my_agent = req.id;
    else
            perror("agent register");

Agents can be unregistered with the IB_USER_MAD_UNREGISTER_AGENT
ioctl. Also, all agents registered through a file descriptor will
be unregistered when the descriptor is closed.

2014 -- a new registration ioctl is now provided which allows additional
fields to be provided during registration.
Users of this registration call are implicitly setting the use of
pkey_index (see below).

Receiving MADs

MADs are received using read(). The receive side now supports
RMPP. The buffer passed to read() must be at least one
struct ib_user_mad + 256 bytes.

If the buffer passed is not large enough to hold the received
MAD (RMPP), read() fails with errno set to ENOSPC and the length of
the buffer needed is set in mad->hdr.length.

Example for normal MAD (non RMPP) reads:

    struct ib_user_mad *mad;
    mad = malloc(sizeof *mad + 256);
    ret = read(fd, mad, sizeof *mad + 256);
    if (ret != sizeof *mad + 256) {
            perror("read");
            free(mad);
    }

Example for RMPP reads:

    struct ib_user_mad *mad;
    mad = malloc(sizeof *mad + 256);
    ret = read(fd, mad, sizeof *mad + 256);
    if (ret < 0 && errno == ENOSPC) {
            /* buffer too small: retry with the length the kernel reported */
            length = mad->hdr.length;
            free(mad);
            mad = malloc(sizeof *mad + length);
            ret = read(fd, mad, sizeof *mad + length);
    }
    if (ret < 0) {
            perror("read");
            free(mad);
    }

In addition to the actual MAD contents, the other struct ib_user_mad
fields will be filled in with information on the received MAD. For
example, the remote LID will be in mad->hdr.lid.

If a send times out, a receive will be generated with mad->hdr.status
set to ETIMEDOUT. Otherwise when a MAD has been successfully received,
mad->hdr.status will be 0.

poll()/select() may be used to wait until a MAD can be read.

Sending MADs

MADs are sent using write(). The agent ID for sending should be
filled into the id field of the MAD, the destination LID should be
filled into the lid field, and so on. The send side supports RMPP,
so MADs of arbitrary length can be sent. For example:

    struct ib_user_mad *mad;

    mad = malloc(sizeof *mad + mad_length);

    /* fill in mad->data */

    mad->hdr.id  = my_agent;    /* req.id from agent registration */
    mad->hdr.lid = my_dest;     /* in network byte order... */
    /* etc. */

    /* pass the buffer itself, not the address of the pointer */
    ret = write(fd, mad, sizeof *mad + mad_length);
    if (ret != sizeof *mad + mad_length)
            perror("write");

Transaction IDs

Users of the umad devices can use the lower 32 bits of the
transaction ID field (that is, the least significant half of the
field in network byte order) in MADs being sent to match
request/response pairs. The upper 32 bits are reserved for use by
the kernel and will be overwritten before a MAD is sent.
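
The split of the transaction ID can be illustrated with a short C
sketch that treats the field as 8 bytes in network byte order. The
helper names tid_set_user and tid_get_user are hypothetical; the
sketch assumes only what the paragraph above states, namely that
the last four bytes belong to the user and the first four may be
overwritten by the kernel.

```c
#include <stdint.h>

/* Stores a 32-bit user tag in bytes 4..7 of the 8-byte transaction ID
 * (the least significant half in network byte order).  Bytes 0..3 are
 * left alone because the kernel overwrites them anyway. */
static void tid_set_user(uint8_t tid[8], uint32_t tag)
{
	tid[4] = (uint8_t)(tag >> 24);
	tid[5] = (uint8_t)(tag >> 16);
	tid[6] = (uint8_t)(tag >> 8);
	tid[7] = (uint8_t)tag;
}

/* Recovers the user tag from a transaction ID, ignoring bytes 0..3. */
static uint32_t tid_get_user(const uint8_t tid[8])
{
	return ((uint32_t)tid[4] << 24) | ((uint32_t)tid[5] << 16) |
	       ((uint32_t)tid[6] << 8) | (uint32_t)tid[7];
}
```

A request tagged with tid_set_user() can be matched against a
response by comparing tid_get_user() on the response's transaction
ID, whatever the kernel wrote into the upper half.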

P_Key Index Handling

The old ib_umad interface did not allow setting the P_Key index for
MADs that are sent and did not provide a way for obtaining the P_Key
index of received MADs. A new layout for struct ib_user_mad_hdr
with a pkey_index member has been defined; however, to preserve binary
compatibility with older applications, this new layout will not be used
unless one of the IB_USER_MAD_ENABLE_PKEY or IB_USER_MAD_REGISTER_AGENT2
ioctls is called before a file descriptor is used for anything else.

In September 2008, the IB_USER_MAD_ABI_VERSION will be incremented
to 6, the new layout of struct ib_user_mad_hdr will be used by
default, and the IB_USER_MAD_ENABLE_PKEY ioctl will be removed.

Setting IsSM Capability Bit

To set the IsSM capability bit for a port, simply open the
corresponding issm device file. If the IsSM bit is already set,
then the open call will block until the bit is cleared (or return
immediately with errno set to EAGAIN if the O_NONBLOCK flag is
passed to open()). The IsSM bit will be cleared when the issm file
is closed. No read, write or other operations can be performed on
the issm file.
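
A non-blocking attempt to set the IsSM bit might look like the
sketch below. The function names issm_try_claim and issm_release
are hypothetical, and opening the file with O_RDONLY is an
assumption; the text above only requires that the file be opened.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Tries to set IsSM by opening the issm device without blocking.
 * Returns a file descriptor (>= 0) on success, -EAGAIN if the bit is
 * already set, or another negative errno value on failure.
 * O_RDONLY is an assumption made for this sketch. */
static int issm_try_claim(const char *path)
{
	int fd = open(path, O_RDONLY | O_NONBLOCK);

	return fd >= 0 ? fd : -errno;
}

/* Clears IsSM again by closing the descriptor. */
static void issm_release(int fd)
{
	if (fd >= 0)
		close(fd);
}
```

A caller would pass a path such as /dev/infiniband/issm0, keep the
descriptor open for as long as the port should advertise IsSM, and
call issm_release() when done.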

/dev files

To create the appropriate character device files automatically with
udev, a rule like

    KERNEL=="umad*", NAME="infiniband/%k"
    KERNEL=="issm*", NAME="infiniband/%k"

can be used. This will create device nodes named

    /dev/infiniband/umad0
    /dev/infiniband/issm0

for the first port, and so on. The InfiniBand device and port
associated with these devices can be determined from the files

    /sys/class/infiniband_mad/umad0/ibdev
    /sys/class/infiniband_mad/umad0/port

and

    /sys/class/infiniband_mad/issm0/ibdev
    /sys/class/infiniband_mad/issm0/port

Documentation/infiniband/user_verbs.txt (normal file, 69 lines)
@@ -0,0 +1,69 @@

USERSPACE VERBS ACCESS

The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_VERBS,
enables direct userspace access to IB hardware via "verbs," as
described in chapter 11 of the InfiniBand Architecture Specification.

To use the verbs, the libibverbs library, available from
http://www.openfabrics.org/, is required. libibverbs contains a
device-independent API for using the ib_uverbs interface.
libibverbs also requires appropriate device-dependent kernel and
userspace drivers for your InfiniBand hardware. For example, to use
a Mellanox HCA, you will need the ib_mthca kernel module and the
libmthca userspace driver to be installed.

User-kernel communication

Userspace communicates with the kernel for slow path, resource
management operations via the /dev/infiniband/uverbsN character
devices. Fast path operations are typically performed by writing
directly to hardware registers mmap()ed into userspace, with no
system call or context switch into the kernel.

Commands are sent to the kernel via write()s on these device files.
The ABI is defined in drivers/infiniband/include/ib_user_verbs.h.
The structs for commands that require a response from the kernel
contain a 64-bit field used to pass a pointer to an output buffer.
Status is returned to userspace as the return value of the write()
system call.
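
The idea of carrying an output-buffer pointer in a fixed 64-bit
field can be sketched as follows. The structs demo_cmd and
demo_resp and both helpers are hypothetical and do not reproduce
the real ABI structures; they only show how a pointer is stored in,
and recovered from, a 64-bit integer so that the field has the same
size whatever the pointer width of the process.

```c
#include <stdint.h>

/* Hypothetical command layout for illustration; not the real ABI. */
struct demo_cmd {
	uint64_t response; /* userspace address of the output buffer */
	uint32_t input;    /* placeholder for command arguments */
	uint32_t reserved;
};

/* Hypothetical output buffer that the kernel would fill in. */
struct demo_resp {
	uint32_t handle;
};

/* Stores the output buffer's address in the 64-bit response field. */
static void demo_cmd_set_response(struct demo_cmd *cmd,
				  struct demo_resp *resp)
{
	cmd->response = (uint64_t)(uintptr_t)resp;
}

/* Recovers the pointer from the 64-bit field. */
static struct demo_resp *demo_cmd_get_response(const struct demo_cmd *cmd)
{
	return (struct demo_resp *)(uintptr_t)cmd->response;
}
```

Going through uintptr_t keeps the conversion well defined on both
32-bit and 64-bit builds, while the field itself always occupies
eight bytes in the command.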

Resource management

Since creation and destruction of all IB resources is done by
commands passed through a file descriptor, the kernel can keep track
of which resources are attached to a given userspace context. The
ib_uverbs module maintains idr tables that are used to translate
between kernel pointers and opaque userspace handles, so that kernel
pointers are never exposed to userspace and userspace cannot trick
the kernel into following a bogus pointer.

This also allows the kernel to clean up when a process exits and
prevent one process from touching another process's resources.

Memory pinning

Direct userspace I/O requires that memory regions that are potential
I/O targets be kept resident at the same physical address. The
ib_uverbs module manages pinning and unpinning memory regions via
get_user_pages() and put_page() calls. It also accounts for the
amount of memory pinned in the process's locked_vm, and checks that
unprivileged processes do not exceed their RLIMIT_MEMLOCK limit.

Pages that are pinned multiple times are counted each time they are
pinned, so the value of locked_vm may be an overestimate of the
number of pages pinned by a process.
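
The accounting described above can be modelled from userspace. This
is a sketch of the arithmetic only, not the kernel's code: the
helper names pin_would_exceed and current_memlock_limit are
hypothetical, and the model assumes the limit is compared in whole
pages against a running count that grows on every pin.

```c
#include <sys/resource.h>

/* Model of the check: would pinning npages more pages push the running
 * count past the RLIMIT_MEMLOCK limit?  'locked' is the count so far;
 * a page pinned twice has been counted twice.  Returns 1 if the limit
 * would be exceeded, 0 otherwise. */
static int pin_would_exceed(unsigned long locked, unsigned long npages,
			    unsigned long page_size, rlim_t limit_bytes)
{
	if (limit_bytes == RLIM_INFINITY)
		return 0;
	return (locked + npages) > (unsigned long)(limit_bytes / page_size);
}

/* Fetches the calling process's soft RLIMIT_MEMLOCK limit in bytes.
 * Returns 0 on success, -1 on failure. */
static int current_memlock_limit(rlim_t *out)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return -1;
	*out = rl.rlim_cur;
	return 0;
}
```

With a 64 KiB limit and 4 KiB pages the model allows 16 counted
pages, so pinning the same 8 pages twice already uses the whole
allowance even though only 8 distinct pages are resident.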

/dev files

To create the appropriate character device files automatically with
udev, a rule like

    KERNEL=="uverbs*", NAME="infiniband/%k"

can be used. This will create device nodes named

    /dev/infiniband/uverbs0

and so on. Since the InfiniBand userspace verbs should be safe for
use by non-privileged processes, it may be useful to add an
appropriate MODE or GROUP to the udev rule.