Fixed MTP to work with TWRP

This commit is contained in:
awab228 2018-06-19 23:16:04 +02:00
commit f6dfaef42e
50820 changed files with 20846062 additions and 0 deletions

View file

@ -0,0 +1,234 @@
00-INDEX
- this file
3c505.txt
- information on the 3Com EtherLink Plus (3c505) driver.
3c509.txt
- information on the 3Com Etherlink III Series Ethernet cards.
6pack.txt
- info on the 6pack protocol, an alternative to KISS for AX.25
LICENSE.qla3xxx
- GPLv2 for QLogic Linux Networking HBA Driver
LICENSE.qlge
- GPLv2 for QLogic Linux qlge NIC Driver
LICENSE.qlcnic
- GPLv2 for QLogic Linux qlcnic NIC Driver
Makefile
- Makefile for docsrc.
PLIP.txt
- PLIP: The Parallel Line Internet Protocol device driver
README.ipw2100
- README for the Intel PRO/Wireless 2100 driver.
README.ipw2200
- README for the Intel PRO/Wireless 2915ABG and 2200BG driver.
README.sb1000
- info on General Instrument/NextLevel SURFboard1000 cable modem.
alias.txt
- info on using alias network devices.
arcnet-hardware.txt
- tons of info on ARCnet, hubs, jumper settings for ARCnet cards, etc.
arcnet.txt
- info on the using the ARCnet driver itself.
atm.txt
- info on where to get ATM programs and support for Linux.
ax25.txt
- info on using AX.25 and NET/ROM code for Linux
batman-adv.txt
- B.A.T.M.A.N routing protocol on top of layer 2 Ethernet Frames.
baycom.txt
- info on the driver for Baycom style amateur radio modems
bonding.txt
- Linux Ethernet Bonding Driver HOWTO: link aggregation in Linux.
bridge.txt
- where to get user space programs for ethernet bridging with Linux.
can.txt
- documentation on CAN protocol family.
cops.txt
- info on the COPS LocalTalk Linux driver
cs89x0.txt
- the Crystal LAN (CS8900/20-based) Ethernet ISA adapter driver
cxacru.txt
- Conexant AccessRunner USB ADSL Modem
cxacru-cf.py
- Conexant AccessRunner USB ADSL Modem configuration file parser
cxgb.txt
- Release Notes for the Chelsio N210 Linux device driver.
dccp.txt
- the Datagram Congestion Control Protocol (DCCP) (RFC 4340..42).
de4x5.txt
- the Digital EtherWORKS DE4?? and DE5?? PCI Ethernet driver
decnet.txt
- info on using the DECnet networking layer in Linux.
dl2k.txt
- README for D-Link DL2000-based Gigabit Ethernet Adapters (dl2k.ko).
dm9000.txt
- README for the Simtec DM9000 Network driver.
dmfe.txt
- info on the Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver.
dns_resolver.txt
- The DNS resolver module allows kernel servies to make DNS queries.
driver.txt
- Softnet driver issues.
e100.txt
- info on Intel's EtherExpress PRO/100 line of 10/100 boards
e1000.txt
- info on Intel's E1000 line of gigabit ethernet boards
e1000e.txt
- README for the Intel Gigabit Ethernet Driver (e1000e).
eql.txt
- serial IP load balancing
fib_trie.txt
- Level Compressed Trie (LC-trie) notes: a structure for routing.
filter.txt
- Linux Socket Filtering
fore200e.txt
- FORE Systems PCA-200E/SBA-200E ATM NIC driver info.
framerelay.txt
- info on using Frame Relay/Data Link Connection Identifier (DLCI).
gen_stats.txt
- Generic networking statistics for netlink users.
generic-hdlc.txt
- The generic High Level Data Link Control (HDLC) layer.
generic_netlink.txt
- info on Generic Netlink
gianfar.txt
- Gianfar Ethernet Driver.
i40e.txt
- README for the Intel Ethernet Controller XL710 Driver (i40e).
i40evf.txt
- Short note on the Driver for the Intel(R) XL710 X710 Virtual Function
ieee802154.txt
- Linux IEEE 802.15.4 implementation, API and drivers
igb.txt
- README for the Intel Gigabit Ethernet Driver (igb).
igbvf.txt
- README for the Intel Gigabit Ethernet Driver (igbvf).
ip-sysctl.txt
- /proc/sys/net/ipv4/* variables
ip_dynaddr.txt
- IP dynamic address hack e.g. for auto-dialup links
ipddp.txt
- AppleTalk-IP Decapsulation and AppleTalk-IP Encapsulation
iphase.txt
- Interphase PCI ATM (i)Chip IA Linux driver info.
ipsec.txt
- Note on not compressing IPSec payload and resulting failed policy check.
ipv6.txt
- Options to the ipv6 kernel module.
ipvs-sysctl.txt
- Per-inode explanation of the /proc/sys/net/ipv4/vs interface.
irda.txt
- where to get IrDA (infrared) utilities and info for Linux.
ixgb.txt
- README for the Intel 10 Gigabit Ethernet Driver (ixgb).
ixgbe.txt
- README for the Intel 10 Gigabit Ethernet Driver (ixgbe).
ixgbevf.txt
- README for the Intel Virtual Function (VF) Driver (ixgbevf).
l2tp.txt
- User guide to the L2TP tunnel protocol.
lapb-module.txt
- programming information of the LAPB module.
ltpc.txt
- the Apple or Farallon LocalTalk PC card driver
mac80211-auth-assoc-deauth.txt
- authentication and association / deauth-disassoc with max80211
mac80211-injection.txt
- HOWTO use packet injection with mac80211
multiqueue.txt
- HOWTO for multiqueue network device support.
netconsole.txt
- The network console module netconsole.ko: configuration and notes.
netdev-FAQ.txt
- FAQ describing how to submit net changes to netdev mailing list.
netdev-features.txt
- Network interface features API description.
netdevices.txt
- info on network device driver functions exported to the kernel.
netif-msg.txt
- Design of the network interface message level setting (NETIF_MSG_*).
netlink_mmap.txt
- memory mapped I/O with netlink
nf_conntrack-sysctl.txt
- list of netfilter-sysctl knobs.
nfc.txt
- The Linux Near Field Communication (NFS) subsystem.
openvswitch.txt
- Open vSwitch developer documentation.
operstates.txt
- Overview of network interface operational states.
packet_mmap.txt
- User guide to memory mapped packet socket rings (PACKET_[RT]X_RING).
phonet.txt
- The Phonet packet protocol used in Nokia cellular modems.
phy.txt
- The PHY abstraction layer.
pktgen.txt
- User guide to the kernel packet generator (pktgen.ko).
policy-routing.txt
- IP policy-based routing
ppp_generic.txt
- Information about the generic PPP driver.
proc_net_tcp.txt
- Per inode overview of the /proc/net/tcp and /proc/net/tcp6 interfaces.
radiotap-headers.txt
- Background on radiotap headers.
ray_cs.txt
- Raylink Wireless LAN card driver info.
rds.txt
- Background on the reliable, ordered datagram delivery method RDS.
regulatory.txt
- Overview of the Linux wireless regulatory infrastructure.
rxrpc.txt
- Guide to the RxRPC protocol.
s2io.txt
- Release notes for Neterion Xframe I/II 10GbE driver.
scaling.txt
- Explanation of network scaling techniques: RSS, RPS, RFS, aRFS, XPS.
sctp.txt
- Notes on the Linux kernel implementation of the SCTP protocol.
secid.txt
- Explanation of the secid member in flow structures.
skfp.txt
- SysKonnect FDDI (SK-5xxx, Compaq Netelligent) driver info.
smc9.txt
- the driver for SMC's 9000 series of Ethernet cards
spider_net.txt
- README for the Spidernet Driver (as found in PS3 / Cell BE).
stmmac.txt
- README for the STMicro Synopsys Ethernet driver.
tc-actions-env-rules.txt
- rules for traffic control (tc) actions.
timestamping.txt
- overview of network packet timestamping variants.
tcp.txt
- short blurb on how TCP output takes place.
tcp-thin.txt
- kernel tuning options for low rate 'thin' TCP streams.
team.txt
- pointer to information for ethernet teaming devices.
tlan.txt
- ThunderLAN (Compaq Netelligent 10/100, Olicom OC-2xxx) driver info.
tproxy.txt
- Transparent proxy support user guide.
tuntap.txt
- TUN/TAP device driver, allowing user space Rx/Tx of packets.
udplite.txt
- UDP-Lite protocol (RFC 3828) introduction.
vortex.txt
- info on using 3Com Vortex (3c590, 3c592, 3c595, 3c597) Ethernet cards.
vxge.txt
- README for the Neterion X3100 PCIe Server Adapter.
vxlan.txt
- Virtual extensible LAN overview
x25.txt
- general info on X.25 development.
x25-iface.txt
- description of the X.25 Packet Layer to LAPB device interface.
xfrm_proc.txt
- description of the statistics package for XFRM.
xfrm_sync.txt
- sync patches for XFRM enable migration of an SA between hosts.
xfrm_sysctl.txt
- description of the XFRM configuration options.
z8530drv.txt
- info about Linux driver for Z8530 based HDLC cards for AX.25

View file

@ -0,0 +1,213 @@
Linux and the 3Com EtherLink III Series Ethercards (driver v1.18c and higher)
----------------------------------------------------------------------------
This file contains the instructions and caveats for v1.18c and higher versions
of the 3c509 driver. You should not use the driver without reading this file.
release 1.0
28 February 2002
Current maintainer (corrections to):
David Ruggiero <jdr@farfalle.com>
----------------------------------------------------------------------------
(0) Introduction
The following are notes and information on using the 3Com EtherLink III series
ethercards in Linux. These cards are commonly known by the most widely-used
card's 3Com model number, 3c509. They are all 10mb/s ISA-bus cards and shouldn't
be (but sometimes are) confused with the similarly-numbered PCI-bus "3c905"
(aka "Vortex" or "Boomerang") series. Kernel support for the 3c509 family is
provided by the module 3c509.c, which has code to support all of the following
models:
3c509 (original ISA card)
3c509B (later revision of the ISA card; supports full-duplex)
3c589 (PCMCIA)
3c589B (later revision of the 3c589; supports full-duplex)
3c579 (EISA)
Large portions of this documentation were heavily borrowed from the guide
written the original author of the 3c509 driver, Donald Becker. The master
copy of that document, which contains notes on older versions of the driver,
currently resides on Scyld web server: http://www.scyld.com/.
(1) Special Driver Features
Overriding card settings
The driver allows boot- or load-time overriding of the card's detected IOADDR,
IRQ, and transceiver settings, although this capability shouldn't generally be
needed except to enable full-duplex mode (see below). An example of the syntax
for LILO parameters for doing this:
ether=10,0x310,3,0x3c509,eth0
This configures the first found 3c509 card for IRQ 10, base I/O 0x310, and
transceiver type 3 (10base2). The flag "0x3c509" must be set to avoid conflicts
with other card types when overriding the I/O address. When the driver is
loaded as a module, only the IRQ may be overridden. For example,
setting two cards to IRQ10 and IRQ11 is done by using the irq module
option:
options 3c509 irq=10,11
(2) Full-duplex mode
The v1.18c driver added support for the 3c509B's full-duplex capabilities.
In order to enable and successfully use full-duplex mode, three conditions
must be met:
(a) You must have a Etherlink III card model whose hardware supports full-
duplex operations. Currently, the only members of the 3c509 family that are
positively known to support full-duplex are the 3c509B (ISA bus) and 3c589B
(PCMCIA) cards. Cards without the "B" model designation do *not* support
full-duplex mode; these include the original 3c509 (no "B"), the original
3c589, the 3c529 (MCA bus), and the 3c579 (EISA bus).
(b) You must be using your card's 10baseT transceiver (i.e., the RJ-45
connector), not its AUI (thick-net) or 10base2 (thin-net/coax) interfaces.
AUI and 10base2 network cabling is physically incapable of full-duplex
operation.
(c) Most importantly, your 3c509B must be connected to a link partner that is
itself full-duplex capable. This is almost certainly one of two things: a full-
duplex-capable Ethernet switch (*not* a hub), or a full-duplex-capable NIC on
another system that's connected directly to the 3c509B via a crossover cable.
Full-duplex mode can be enabled using 'ethtool'.
/////Extremely important caution concerning full-duplex mode/////
Understand that the 3c509B's hardware's full-duplex support is much more
limited than that provide by more modern network interface cards. Although
at the physical layer of the network it fully supports full-duplex operation,
the card was designed before the current Ethernet auto-negotiation (N-way)
spec was written. This means that the 3c509B family ***cannot and will not
auto-negotiate a full-duplex connection with its link partner under any
circumstances, no matter how it is initialized***. If the full-duplex mode
of the 3c509B is enabled, its link partner will very likely need to be
independently _forced_ into full-duplex mode as well; otherwise various nasty
failures will occur - at the very least, you'll see massive numbers of packet
collisions. This is one of very rare circumstances where disabling auto-
negotiation and forcing the duplex mode of a network interface card or switch
would ever be necessary or desirable.
(3) Available Transceiver Types
For versions of the driver v1.18c and above, the available transceiver types are:
0 transceiver type from EEPROM config (normally 10baseT); force half-duplex
1 AUI (thick-net / DB15 connector)
2 (undefined)
3 10base2 (thin-net == coax / BNC connector)
4 10baseT (RJ-45 connector); force half-duplex mode
8 transceiver type and duplex mode taken from card's EEPROM config settings
12 10baseT (RJ-45 connector); force full-duplex mode
Prior to driver version 1.18c, only transceiver codes 0-4 were supported. Note
that the new transceiver codes 8 and 12 are the *only* ones that will enable
full-duplex mode, no matter what the card's detected EEPROM settings might be.
This insured that merely upgrading the driver from an earlier version would
never automatically enable full-duplex mode in an existing installation;
it must always be explicitly enabled via one of these code in order to be
activated.
The transceiver type can be changed using 'ethtool'.
(4a) Interpretation of error messages and common problems
Error Messages
eth0: Infinite loop in interrupt, status 2011.
These are "mostly harmless" message indicating that the driver had too much
work during that interrupt cycle. With a status of 0x2011 you are receiving
packets faster than they can be removed from the card. This should be rare
or impossible in normal operation. Possible causes of this error report are:
- a "green" mode enabled that slows the processor down when there is no
keyboard activity.
- some other device or device driver hogging the bus or disabling interrupts.
Check /proc/interrupts for excessive interrupt counts. The timer tick
interrupt should always be incrementing faster than the others.
No received packets
If a 3c509, 3c562 or 3c589 can successfully transmit packets, but never
receives packets (as reported by /proc/net/dev or 'ifconfig') you likely
have an interrupt line problem. Check /proc/interrupts to verify that the
card is actually generating interrupts. If the interrupt count is not
increasing you likely have a physical conflict with two devices trying to
use the same ISA IRQ line. The common conflict is with a sound card on IRQ10
or IRQ5, and the easiest solution is to move the 3c509 to a different
interrupt line. If the device is receiving packets but 'ping' doesn't work,
you have a routing problem.
Tx Carrier Errors Reported in /proc/net/dev
If an EtherLink III appears to transmit packets, but the "Tx carrier errors"
field in /proc/net/dev increments as quickly as the Tx packet count, you
likely have an unterminated network or the incorrect media transceiver selected.
3c509B card is not detected on machines with an ISA PnP BIOS.
While the updated driver works with most PnP BIOS programs, it does not work
with all. This can be fixed by disabling PnP support using the 3Com-supplied
setup program.
3c509 card is not detected on overclocked machines
Increase the delay time in id_read_eeprom() from the current value, 500,
to an absurdly high value, such as 5000.
(4b) Decoding Status and Error Messages
The bits in the main status register are:
value description
0x01 Interrupt latch
0x02 Tx overrun, or Rx underrun
0x04 Tx complete
0x08 Tx FIFO room available
0x10 A complete Rx packet has arrived
0x20 A Rx packet has started to arrive
0x40 The driver has requested an interrupt
0x80 Statistics counter nearly full
The bits in the transmit (Tx) status word are:
value description
0x02 Out-of-window collision.
0x04 Status stack overflow (normally impossible).
0x08 16 collisions.
0x10 Tx underrun (not enough PCI bus bandwidth).
0x20 Tx jabber.
0x40 Tx interrupt requested.
0x80 Status is valid (this should always be set).
When a transmit error occurs the driver produces a status message such as
eth0: Transmit error, Tx status register 82
The two values typically seen here are:
0x82
Out of window collision. This typically occurs when some other Ethernet
host is incorrectly set to full duplex on a half duplex network.
0x88
16 collisions. This typically occurs when the network is exceptionally busy
or when another host doesn't correctly back off after a collision. If this
error is mixed with 0x82 errors it is the result of a host incorrectly set
to full duplex (see above).
Both of these errors are the result of network problems that should be
corrected. They do not represent driver malfunction.
(5) Revision history (this file)
28Feb02 v1.0 DR New; major portions based on Becker original 3c509 docs

View file

@ -0,0 +1,175 @@
This is the 6pack-mini-HOWTO, written by
Andreas Könsgen DG3KQ
Internet: ajk@comnets.uni-bremen.de
AMPR-net: dg3kq@db0pra.ampr.org
AX.25: dg3kq@db0ach.#nrw.deu.eu
Last update: April 7, 1998
1. What is 6pack, and what are the advantages to KISS?
6pack is a transmission protocol for data exchange between the PC and
the TNC over a serial line. It can be used as an alternative to KISS.
6pack has two major advantages:
- The PC is given full control over the radio
channel. Special control data is exchanged between the PC and the TNC so
that the PC knows at any time if the TNC is receiving data, if a TNC
buffer underrun or overrun has occurred, if the PTT is
set and so on. This control data is processed at a higher priority than
normal data, so a data stream can be interrupted at any time to issue an
important event. This helps to improve the channel access and timing
algorithms as everything is computed in the PC. It would even be possible
to experiment with something completely different from the known CSMA and
DAMA channel access methods.
This kind of real-time control is especially important to supply several
TNCs that are connected between each other and the PC by a daisy chain
(however, this feature is not supported yet by the Linux 6pack driver).
- Each packet transferred over the serial line is supplied with a checksum,
so it is easy to detect errors due to problems on the serial line.
Received packets that are corrupt are not passed on to the AX.25 layer.
Damaged packets that the TNC has received from the PC are not transmitted.
More details about 6pack are described in the file 6pack.ps that is located
in the doc directory of the AX.25 utilities package.
2. Who has developed the 6pack protocol?
The 6pack protocol has been developed by Ekki Plicht DF4OR, Henning Rech
DF9IC and Gunter Jost DK7WJ. A driver for 6pack, written by Gunter Jost and
Matthias Welwarsky DG2FEF, comes along with the PC version of FlexNet.
They have also written a firmware for TNCs to perform the 6pack
protocol (see section 4 below).
3. Where can I get the latest version of 6pack for LinuX?
At the moment, the 6pack stuff can obtained via anonymous ftp from
db0bm.automation.fh-aachen.de. In the directory /incoming/dg3kq,
there is a file named 6pack.tgz.
4. Preparing the TNC for 6pack operation
To be able to use 6pack, a special firmware for the TNC is needed. The EPROM
of a newly bought TNC does not contain 6pack, so you will have to
program an EPROM yourself. The image file for 6pack EPROMs should be
available on any packet radio box where PC/FlexNet can be found. The name of
the file is 6pack.bin. This file is copyrighted and maintained by the FlexNet
team. It can be used under the terms of the license that comes along
with PC/FlexNet. Please do not ask me about the internals of this file as I
don't know anything about it. I used a textual description of the 6pack
protocol to program the Linux driver.
TNCs contain a 64kByte EPROM, the lower half of which is used for
the firmware/KISS. The upper half is either empty or is sometimes
programmed with software called TAPR. In the latter case, the TNC
is supplied with a DIP switch so you can easily change between the
two systems. When programming a new EPROM, one of the systems is replaced
by 6pack. It is useful to replace TAPR, as this software is rarely used
nowadays. If your TNC is not equipped with the switch mentioned above, you
can build in one yourself that switches over the highest address pin
of the EPROM between HIGH and LOW level. After having inserted the new EPROM
and switched to 6pack, apply power to the TNC for a first test. The connect
and the status LED are lit for about a second if the firmware initialises
the TNC correctly.
5. Building and installing the 6pack driver
The driver has been tested with kernel version 2.1.90. Use with older
kernels may lead to a compilation error because the interface to a kernel
function has been changed in the 2.1.8x kernels.
How to turn on 6pack support:
- In the linux kernel configuration program, select the code maturity level
options menu and turn on the prompting for development drivers.
- Select the amateur radio support menu and turn on the serial port 6pack
driver.
- Compile and install the kernel and the modules.
To use the driver, the kissattach program delivered with the AX.25 utilities
has to be modified.
- Do a cd to the directory that holds the kissattach sources. Edit the
kissattach.c file. At the top, insert the following lines:
#ifndef N_6PACK
#define N_6PACK (N_AX25+1)
#endif
Then find the line
int disc = N_AX25;
and replace N_AX25 by N_6PACK.
- Recompile kissattach. Rename it to spattach to avoid confusions.
Installing the driver:
- Do an insmod 6pack. Look at your /var/log/messages file to check if the
module has printed its initialization message.
- Do a spattach as you would launch kissattach when starting a KISS port.
Check if the kernel prints the message '6pack: TNC found'.
- From here, everything should work as if you were setting up a KISS port.
The only difference is that the network device that represents
the 6pack port is called sp instead of sl or ax. So, sp0 would be the
first 6pack port.
Although the driver has been tested on various platforms, I still declare it
ALPHA. BE CAREFUL! Sync your disks before insmoding the 6pack module
and spattaching. Watch out if your computer behaves strangely. Read section
6 of this file about known problems.
Note that the connect and status LEDs of the TNC are controlled in a
different way than they are when the TNC is used with PC/FlexNet. When using
FlexNet, the connect LED is on if there is a connection; the status LED is
on if there is data in the buffer of the PC's AX.25 engine that has to be
transmitted. Under Linux, the 6pack layer is beyond the AX.25 layer,
so the 6pack driver doesn't know anything about connects or data that
has not yet been transmitted. Therefore the LEDs are controlled
as they are in KISS mode: The connect LED is turned on if data is transferred
from the PC to the TNC over the serial line, the status LED if data is
sent to the PC.
6. Known problems
When testing the driver with 2.0.3x kernels and
operating with data rates on the radio channel of 9600 Baud or higher,
the driver may, on certain systems, sometimes print the message '6pack:
bad checksum', which is due to data loss if the other station sends two
or more subsequent packets. I have been told that this is due to a problem
with the serial driver of 2.0.3x kernels. I don't know yet if the problem
still exists with 2.1.x kernels, as I have heard that the serial driver
code has been changed with 2.1.x.
When shutting down the sp interface with ifconfig, the kernel crashes if
there is still an AX.25 connection left over which an IP connection was
running, even if that IP connection is already closed. The problem does not
occur when there is a bare AX.25 connection still running. I don't know if
this is a problem of the 6pack driver or something else in the kernel.
The driver has been tested as a module, not yet as a kernel-builtin driver.
The 6pack protocol supports daisy-chaining of TNCs in a token ring, which is
connected to one serial port of the PC. This feature is not implemented
and at least at the moment I won't be able to do it because I do not have
the opportunity to build a TNC daisy-chain and test it.
Some of the comments in the source code are inaccurate. They are left from
the SLIP/KISS driver, from which the 6pack driver has been derived.
I haven't modified or removed them yet -- sorry! The code itself needs
some cleaning and optimizing. This will be done in a later release.
If you encounter a bug or if you have a question or suggestion concerning the
driver, feel free to mail me, using the addresses given at the beginning of
this file.
Have fun!
Andreas

View file

@ -0,0 +1,46 @@
Copyright (c) 2003-2006 QLogic Corporation
QLogic Linux Networking HBA Driver
This program includes a device driver for Linux 2.6 that may be
distributed with QLogic hardware specific firmware binary file.
You may modify and redistribute the device driver code under the
GNU General Public License as published by the Free Software
Foundation (version 2 or a later version).
You may redistribute the hardware specific firmware binary file
under the following terms:
1. Redistribution of source code (only if applicable),
must retain the above copyright notice, this list of
conditions and the following disclaimer.
2. Redistribution in binary form must reproduce the above
copyright notice, this list of conditions and the
following disclaimer in the documentation and/or other
materials provided with the distribution.
3. The name of QLogic Corporation may not be used to
endorse or promote products derived from this software
without specific prior written permission
REGARDLESS OF WHAT LICENSING MECHANISM IS USED OR APPLICABLE,
THIS PROGRAM IS PROVIDED BY QLOGIC CORPORATION "AS IS'' AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
USER ACKNOWLEDGES AND AGREES THAT USE OF THIS PROGRAM WILL NOT
CREATE OR GIVE GROUNDS FOR A LICENSE BY IMPLICATION, ESTOPPEL, OR
OTHERWISE IN ANY INTELLECTUAL PROPERTY RIGHTS (PATENT, COPYRIGHT,
TRADE SECRET, MASK WORK, OR OTHER PROPRIETARY RIGHT) EMBODIED IN
ANY OTHER QLOGIC HARDWARE OR SOFTWARE EITHER SOLELY OR IN
COMBINATION WITH THIS PROGRAM.

View file

@ -0,0 +1,288 @@
Copyright (c) 2009-2013 QLogic Corporation
QLogic Linux qlcnic NIC Driver
You may modify and redistribute the device driver code under the
GNU General Public License (a copy of which is attached hereto as
Exhibit A) published by the Free Software Foundation (version 2).
EXHIBIT A
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Lesser General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

View file

@ -0,0 +1,288 @@
Copyright (c) 2003-2011 QLogic Corporation
QLogic Linux qlge NIC Driver
You may modify and redistribute the device driver code under the
GNU General Public License (a copy of which is attached hereto as
Exhibit A) published by the Free Software Foundation (version 2).
EXHIBIT A
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Lesser General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

View file

@ -0,0 +1 @@
subdir-y := timestamping

View file

@ -0,0 +1,215 @@
PLIP: The Parallel Line Internet Protocol Device
Donald Becker (becker@super.org)
I.D.A. Supercomputing Research Center, Bowie MD 20715
At some point T. Thorn will probably contribute text,
Tommy Thorn (tthorn@daimi.aau.dk)
PLIP Introduction
-----------------
This document describes the parallel port packet pusher for Net/LGX.
This device interface allows a point-to-point connection between two
parallel ports to appear as a IP network interface.
What is PLIP?
=============
PLIP is Parallel Line IP, that is, the transportation of IP packages
over a parallel port. In the case of a PC, the obvious choice is the
printer port. PLIP is a non-standard, but [can use] uses the standard
LapLink null-printer cable [can also work in turbo mode, with a PLIP
cable]. [The protocol used to pack IP packages, is a simple one
initiated by Crynwr.]
Advantages of PLIP
==================
It's cheap, it's available everywhere, and it's easy.
The PLIP cable is all that's needed to connect two Linux boxes, and it
can be built for very few bucks.
Connecting two Linux boxes takes only a second's decision and a few
minutes' work, no need to search for a [supported] netcard. This might
even be especially important in the case of notebooks, where netcards
are not easily available.
Not requiring a netcard also means that apart from connecting the
cables, everything else is software configuration [which in principle
could be made very easy.]
Disadvantages of PLIP
=====================
Doesn't work over a modem, like SLIP and PPP. Limited range, 15 m.
Can only be used to connect three (?) Linux boxes. Doesn't connect to
an existing Ethernet. Isn't standard (not even de facto standard, like
SLIP).
Performance
===========
PLIP easily outperforms Ethernet cards....(ups, I was dreaming, but
it *is* getting late. EOB)
PLIP driver details
-------------------
The Linux PLIP driver is an implementation of the original Crynwr protocol,
that uses the parallel port subsystem of the kernel in order to properly
share parallel ports between PLIP and other services.
IRQs and trigger timeouts
=========================
When a parallel port used for a PLIP driver has an IRQ configured to it, the
PLIP driver is signaled whenever data is sent to it via the cable, such that
when no data is available, the driver isn't being used.
However, on some machines it is hard, if not impossible, to configure an IRQ
to a certain parallel port, mainly because it is used by some other device.
On these machines, the PLIP driver can be used in IRQ-less mode, where
the PLIP driver would constantly poll the parallel port for data waiting,
and if such data is available, process it. This mode is less efficient than
the IRQ mode, because the driver has to check the parallel port many times
per second, even when no data at all is sent. Some rough measurements
indicate that there isn't a noticeable performance drop when using IRQ-less
mode as compared to IRQ mode as far as the data transfer speed is involved.
There is a performance drop on the machine hosting the driver.
When the PLIP driver is used in IRQ mode, the timeout used for triggering a
data transfer (the maximal time the PLIP driver would allow the other side
before announcing a timeout, when trying to handshake a transfer of some
data) is, by default, 500usec. As IRQ delivery is more or less immediate,
this timeout is quite sufficient.
When in IRQ-less mode, the PLIP driver polls the parallel port HZ times
per second (where HZ is typically 100 on most platforms, and 1024 on an
Alpha, as of this writing). Between two such polls, there are 10^6/HZ usecs.
On an i386, for example, 10^6/100 = 10000usec. It is easy to see that it is
quite possible for the trigger timeout to expire between two such polls, as
the timeout is only 500usec long. As a result, it is required to change the
trigger timeout on the *other* side of a PLIP connection, to about
10^6/HZ usecs. If both sides of a PLIP connection are used in IRQ-less mode,
this timeout is required on both sides.
It appears that in practice, the trigger timeout can be shorter than in the
above calculation. It isn't an important issue, unless the wire is faulty,
in which case a long timeout would stall the machine when, for whatever
reason, bits are dropped.
A utility that can perform this change in Linux is plipconfig, which is part
of the net-tools package (its location can be found in the
Documentation/Changes file). An example command would be
'plipconfig plipX trigger 10000', where plipX is the appropriate
PLIP device.
PLIP hardware interconnection
-----------------------------
PLIP uses several different data transfer methods. The first (and the
only one implemented in the early version of the code) uses a standard
printer "null" cable to transfer data four bits at a time using
data bit outputs connected to status bit inputs.
The second data transfer method relies on both machines having
bi-directional parallel ports, rather than output-only ``printer''
ports. This allows byte-wide transfers and avoids reconstructing
nibbles into bytes, leading to much faster transfers.
Parallel Transfer Mode 0 Cable
==============================
The cable for the first transfer mode is a standard
printer "null" cable which transfers data four bits at a time using
data bit outputs of the first port (machine T) connected to the
status bit inputs of the second port (machine R). There are five
status inputs, and they are used as four data inputs and a clock (data
strobe) input, arranged so that the data input bits appear as contiguous
bits with standard status register implementation.
A cable that implements this protocol is available commercially as a
"Null Printer" or "Turbo Laplink" cable. It can be constructed with
two DB-25 male connectors symmetrically connected as follows:
STROBE output 1*
D0->ERROR 2 - 15 15 - 2
D1->SLCT 3 - 13 13 - 3
D2->PAPOUT 4 - 12 12 - 4
D3->ACK 5 - 10 10 - 5
D4->BUSY 6 - 11 11 - 6
D5,D6,D7 are 7*, 8*, 9*
AUTOFD output 14*
INIT output 16*
SLCTIN 17 - 17
extra grounds are 18*,19*,20*,21*,22*,23*,24*
GROUND 25 - 25
* Do not connect these pins on either end
If the cable you are using has a metallic shield it should be
connected to the metallic DB-25 shell at one end only.
Parallel Transfer Mode 1
========================
The second data transfer method relies on both machines having
bi-directional parallel ports, rather than output-only ``printer''
ports. This allows byte-wide transfers, and avoids reconstructing
nibbles into bytes. This cable should not be used on unidirectional
``printer'' (as opposed to ``parallel'') ports or when the machine
isn't configured for PLIP, as it will result in output driver
conflicts and the (unlikely) possibility of damage.
The cable for this transfer mode should be constructed as follows:
STROBE->BUSY 1 - 11
D0->D0 2 - 2
D1->D1 3 - 3
D2->D2 4 - 4
D3->D3 5 - 5
D4->D4 6 - 6
D5->D5 7 - 7
D6->D6 8 - 8
D7->D7 9 - 9
INIT -> ACK 16 - 10
AUTOFD->PAPOUT 14 - 12
SLCT->SLCTIN 13 - 17
GND->ERROR 18 - 15
extra grounds are 19*,20*,21*,22*,23*,24*
GROUND 25 - 25
* Do not connect these pins on either end
Once again, if the cable you are using has a metallic shield it should
be connected to the metallic DB-25 shell at one end only.
PLIP Mode 0 transfer protocol
=============================
The PLIP driver is compatible with the "Crynwr" parallel port transfer
standard in Mode 0. That standard specifies the following protocol:
send header nibble '0x8'
count-low octet
count-high octet
... data octets
checksum octet
Each octet is sent as
<wait for rx. '0x1?'> <send 0x10+(octet&0x0F)>
<wait for rx. '0x0?'> <send 0x00+((octet>>4)&0x0F)>
To start a transfer the transmitting machine outputs a nibble 0x08.
That raises the ACK line, triggering an interrupt in the receiving
machine. The receiving machine disables interrupts and raises its own ACK
line.
Restated:
(OUT is bit 0-4, OUT.j is bit j from OUT. IN likewise)
Send_Byte:
OUT := low nibble, OUT.4 := 1
WAIT FOR IN.4 = 1
OUT := high nibble, OUT.4 := 0
WAIT FOR IN.4 = 0

View file

@ -0,0 +1,293 @@
Intel(R) PRO/Wireless 2100 Driver for Linux in support of:
Intel(R) PRO/Wireless 2100 Network Connection
Copyright (C) 2003-2006, Intel Corporation
README.ipw2100
Version: git-1.1.5
Date : January 25, 2006
Index
-----------------------------------------------
0. IMPORTANT INFORMATION BEFORE USING THIS DRIVER
1. Introduction
2. Release git-1.1.5 Current Features
3. Command Line Parameters
4. Sysfs Helper Files
5. Radio Kill Switch
6. Dynamic Firmware
7. Power Management
8. Support
9. License
0. IMPORTANT INFORMATION BEFORE USING THIS DRIVER
-----------------------------------------------
Important Notice FOR ALL USERS OR DISTRIBUTORS!!!!
Intel wireless LAN adapters are engineered, manufactured, tested, and
quality checked to ensure that they meet all necessary local and
governmental regulatory agency requirements for the regions that they
are designated and/or marked to ship into. Since wireless LANs are
generally unlicensed devices that share spectrum with radars,
satellites, and other licensed and unlicensed devices, it is sometimes
necessary to dynamically detect, avoid, and limit usage to avoid
interference with these devices. In many instances Intel is required to
provide test data to prove regional and local compliance to regional and
governmental regulations before certification or approval to use the
product is granted. Intel's wireless LAN's EEPROM, firmware, and
software driver are designed to carefully control parameters that affect
radio operation and to ensure electromagnetic compliance (EMC). These
parameters include, without limitation, RF power, spectrum usage,
channel scanning, and human exposure.
For these reasons Intel cannot permit any manipulation by third parties
of the software provided in binary format with the wireless WLAN
adapters (e.g., the EEPROM and firmware). Furthermore, if you use any
patches, utilities, or code with the Intel wireless LAN adapters that
have been manipulated by an unauthorized party (i.e., patches,
utilities, or code (including open source code modifications) which have
not been validated by Intel), (i) you will be solely responsible for
ensuring the regulatory compliance of the products, (ii) Intel will bear
no liability, under any theory of liability for any issues associated
with the modified products, including without limitation, claims under
the warranty and/or issues arising from regulatory non-compliance, and
(iii) Intel will not provide or be required to assist in providing
support to any third parties for such modified products.
Note: Many regulatory agencies consider Wireless LAN adapters to be
modules, and accordingly, condition system-level regulatory approval
upon receipt and review of test data documenting that the antennas and
system configuration do not cause the EMC and radio operation to be
non-compliant.
The drivers available for download from SourceForge are provided as a
part of a development project. Conformance to local regulatory
requirements is the responsibility of the individual developer. As
such, if you are interested in deploying or shipping a driver as part of
solution intended to be used for purposes other than development, please
obtain a tested driver from Intel Customer Support at:
http://www.intel.com/support/wireless/sb/CS-006408.htm
1. Introduction
-----------------------------------------------
This document provides a brief overview of the features supported by the
IPW2100 driver project. The main project website, where the latest
development version of the driver can be found, is:
http://ipw2100.sourceforge.net
There you can find the not only the latest releases, but also information about
potential fixes and patches, as well as links to the development mailing list
for the driver project.
2. Release git-1.1.5 Current Supported Features
-----------------------------------------------
- Managed (BSS) and Ad-Hoc (IBSS)
- WEP (shared key and open)
- Wireless Tools support
- 802.1x (tested with XSupplicant 1.0.1)
Enabled (but not supported) features:
- Monitor/RFMon mode
- WPA/WPA2
The distinction between officially supported and enabled is a reflection
on the amount of validation and interoperability testing that has been
performed on a given feature.
3. Command Line Parameters
-----------------------------------------------
If the driver is built as a module, the following optional parameters are used
by entering them on the command line with the modprobe command using this
syntax:
modprobe ipw2100 [<option>=<VAL1><,VAL2>...]
For example, to disable the radio on driver loading, enter:
modprobe ipw2100 disable=1
The ipw2100 driver supports the following module parameters:
Name Value Example:
debug 0x0-0xffffffff debug=1024
mode 0,1,2 mode=1 /* AdHoc */
channel int channel=3 /* Only valid in AdHoc or Monitor */
associate boolean associate=0 /* Do NOT auto associate */
disable boolean disable=1 /* Do not power the HW */
4. Sysfs Helper Files
---------------------------
-----------------------------------------------
There are several ways to control the behavior of the driver. Many of the
general capabilities are exposed through the Wireless Tools (iwconfig). There
are a few capabilities that are exposed through entries in the Linux Sysfs.
----- Driver Level ------
For the driver level files, look in /sys/bus/pci/drivers/ipw2100/
debug_level
This controls the same global as the 'debug' module parameter. For
information on the various debugging levels available, run the 'dvals'
script found in the driver source directory.
NOTE: 'debug_level' is only enabled if CONFIG_IPW2100_DEBUG is turn
on.
----- Device Level ------
For the device level files look in
/sys/bus/pci/drivers/ipw2100/{PCI-ID}/
For example:
/sys/bus/pci/drivers/ipw2100/0000:02:01.0
For the device level files, see /sys/bus/pci/drivers/ipw2100:
rf_kill
read -
0 = RF kill not enabled (radio on)
1 = SW based RF kill active (radio off)
2 = HW based RF kill active (radio off)
3 = Both HW and SW RF kill active (radio off)
write -
0 = If SW based RF kill active, turn the radio back on
1 = If radio is on, activate SW based RF kill
NOTE: If you enable the SW based RF kill and then toggle the HW
based RF kill from ON -> OFF -> ON, the radio will NOT come back on
5. Radio Kill Switch
-----------------------------------------------
Most laptops provide the ability for the user to physically disable the radio.
Some vendors have implemented this as a physical switch that requires no
software to turn the radio off and on. On other laptops, however, the switch
is controlled through a button being pressed and a software driver then making
calls to turn the radio off and on. This is referred to as a "software based
RF kill switch"
See the Sysfs helper file 'rf_kill' for determining the state of the RF switch
on your system.
6. Dynamic Firmware
-----------------------------------------------
As the firmware is licensed under a restricted use license, it can not be
included within the kernel sources. To enable the IPW2100 you will need a
firmware image to load into the wireless NIC's processors.
You can obtain these images from <http://ipw2100.sf.net/firmware.php>.
See INSTALL for instructions on installing the firmware.
7. Power Management
-----------------------------------------------
The IPW2100 supports the configuration of the Power Save Protocol
through a private wireless extension interface. The IPW2100 supports
the following different modes:
off No power management. Radio is always on.
on Automatic power management
1-5 Different levels of power management. The higher the
number the greater the power savings, but with an impact to
packet latencies.
Power management works by powering down the radio after a certain
interval of time has passed where no packets are passed through the
radio. Once powered down, the radio remains in that state for a given
period of time. For higher power savings, the interval between last
packet processed to sleep is shorter and the sleep period is longer.
When the radio is asleep, the access point sending data to the station
must buffer packets at the AP until the station wakes up and requests
any buffered packets. If you have an AP that does not correctly support
the PSP protocol you may experience packet loss or very poor performance
while power management is enabled. If this is the case, you will need
to try and find a firmware update for your AP, or disable power
management (via `iwconfig eth1 power off`)
To configure the power level on the IPW2100 you use a combination of
iwconfig and iwpriv. iwconfig is used to turn power management on, off,
and set it to auto.
iwconfig eth1 power off Disables radio power down
iwconfig eth1 power on Enables radio power management to
last set level (defaults to AUTO)
iwpriv eth1 set_power 0 Sets power level to AUTO and enables
power management if not previously
enabled.
iwpriv eth1 set_power 1-5 Set the power level as specified,
enabling power management if not
previously enabled.
You can view the current power level setting via:
iwpriv eth1 get_power
It will return the current period or timeout that is configured as a string
in the form of xxxx/yyyy (z) where xxxx is the timeout interval (amount of
time after packet processing), yyyy is the period to sleep (amount of time to
wait before powering the radio and querying the access point for buffered
packets), and z is the 'power level'. If power management is turned off the
xxxx/yyyy will be replaced with 'off' -- the level reported will be the active
level if `iwconfig eth1 power on` is invoked.
8. Support
-----------------------------------------------
For general development information and support,
go to:
http://ipw2100.sf.net/
The ipw2100 1.1.0 driver and firmware can be downloaded from:
http://support.intel.com
For installation support on the ipw2100 1.1.0 driver on Linux kernels
2.6.8 or greater, email support is available from:
http://supportmail.intel.com
9. License
-----------------------------------------------
Copyright(c) 2003 - 2006 Intel Corporation. All rights reserved.
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License (version 2) as
published by the Free Software Foundation.
This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59
Temple Place - Suite 330, Boston, MA 02111-1307, USA.
The full GNU General Public License is included in this distribution in the
file called LICENSE.
License Contact Information:
James P. Ketrenos <ipw2100-admin@linux.intel.com>
Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497

View file

@ -0,0 +1,472 @@
Intel(R) PRO/Wireless 2915ABG Driver for Linux in support of:
Intel(R) PRO/Wireless 2200BG Network Connection
Intel(R) PRO/Wireless 2915ABG Network Connection
Note: The Intel(R) PRO/Wireless 2915ABG Driver for Linux and Intel(R)
PRO/Wireless 2200BG Driver for Linux is a unified driver that works on
both hardware adapters listed above. In this document the Intel(R)
PRO/Wireless 2915ABG Driver for Linux will be used to reference the
unified driver.
Copyright (C) 2004-2006, Intel Corporation
README.ipw2200
Version: 1.1.2
Date : March 30, 2006
Index
-----------------------------------------------
0. IMPORTANT INFORMATION BEFORE USING THIS DRIVER
1. Introduction
1.1. Overview of features
1.2. Module parameters
1.3. Wireless Extension Private Methods
1.4. Sysfs Helper Files
1.5. Supported channels
2. Ad-Hoc Networking
3. Interacting with Wireless Tools
3.1. iwconfig mode
3.2. iwconfig sens
4. About the Version Numbers
5. Firmware installation
6. Support
7. License
0. IMPORTANT INFORMATION BEFORE USING THIS DRIVER
-----------------------------------------------
Important Notice FOR ALL USERS OR DISTRIBUTORS!!!!
Intel wireless LAN adapters are engineered, manufactured, tested, and
quality checked to ensure that they meet all necessary local and
governmental regulatory agency requirements for the regions that they
are designated and/or marked to ship into. Since wireless LANs are
generally unlicensed devices that share spectrum with radars,
satellites, and other licensed and unlicensed devices, it is sometimes
necessary to dynamically detect, avoid, and limit usage to avoid
interference with these devices. In many instances Intel is required to
provide test data to prove regional and local compliance to regional and
governmental regulations before certification or approval to use the
product is granted. Intel's wireless LAN's EEPROM, firmware, and
software driver are designed to carefully control parameters that affect
radio operation and to ensure electromagnetic compliance (EMC). These
parameters include, without limitation, RF power, spectrum usage,
channel scanning, and human exposure.
For these reasons Intel cannot permit any manipulation by third parties
of the software provided in binary format with the wireless WLAN
adapters (e.g., the EEPROM and firmware). Furthermore, if you use any
patches, utilities, or code with the Intel wireless LAN adapters that
have been manipulated by an unauthorized party (i.e., patches,
utilities, or code (including open source code modifications) which have
not been validated by Intel), (i) you will be solely responsible for
ensuring the regulatory compliance of the products, (ii) Intel will bear
no liability, under any theory of liability for any issues associated
with the modified products, including without limitation, claims under
the warranty and/or issues arising from regulatory non-compliance, and
(iii) Intel will not provide or be required to assist in providing
support to any third parties for such modified products.
Note: Many regulatory agencies consider Wireless LAN adapters to be
modules, and accordingly, condition system-level regulatory approval
upon receipt and review of test data documenting that the antennas and
system configuration do not cause the EMC and radio operation to be
non-compliant.
The drivers available for download from SourceForge are provided as a
part of a development project. Conformance to local regulatory
requirements is the responsibility of the individual developer. As
such, if you are interested in deploying or shipping a driver as part of
solution intended to be used for purposes other than development, please
obtain a tested driver from Intel Customer Support at:
http://support.intel.com
1. Introduction
-----------------------------------------------
The following sections attempt to provide a brief introduction to using
the Intel(R) PRO/Wireless 2915ABG Driver for Linux.
This document is not meant to be a comprehensive manual on
understanding or using wireless technologies, but should be sufficient
to get you moving without wires on Linux.
For information on building and installing the driver, see the INSTALL
file.
1.1. Overview of Features
-----------------------------------------------
The current release (1.1.2) supports the following features:
+ BSS mode (Infrastructure, Managed)
+ IBSS mode (Ad-Hoc)
+ WEP (OPEN and SHARED KEY mode)
+ 802.1x EAP via wpa_supplicant and xsupplicant
+ Wireless Extension support
+ Full B and G rate support (2200 and 2915)
+ Full A rate support (2915 only)
+ Transmit power control
+ S state support (ACPI suspend/resume)
The following features are currently enabled, but not officially
supported:
+ WPA
+ long/short preamble support
+ Monitor mode (aka RFMon)
The distinction between officially supported and enabled is a reflection
on the amount of validation and interoperability testing that has been
performed on a given feature.
1.2. Command Line Parameters
-----------------------------------------------
Like many modules used in the Linux kernel, the Intel(R) PRO/Wireless
2915ABG Driver for Linux allows configuration options to be provided
as module parameters. The most common way to specify a module parameter
is via the command line.
The general form is:
% modprobe ipw2200 parameter=value
Where the supported parameter are:
associate
Set to 0 to disable the auto scan-and-associate functionality of the
driver. If disabled, the driver will not attempt to scan
for and associate to a network until it has been configured with
one or more properties for the target network, for example configuring
the network SSID. Default is 0 (do not auto-associate)
Example: % modprobe ipw2200 associate=0
auto_create
Set to 0 to disable the auto creation of an Ad-Hoc network
matching the channel and network name parameters provided.
Default is 1.
channel
channel number for association. The normal method for setting
the channel would be to use the standard wireless tools
(i.e. `iwconfig eth1 channel 10`), but it is useful sometimes
to set this while debugging. Channel 0 means 'ANY'
debug
If using a debug build, this is used to control the amount of debug
info is logged. See the 'dvals' and 'load' script for more info on
how to use this (the dvals and load scripts are provided as part
of the ipw2200 development snapshot releases available from the
SourceForge project at http://ipw2200.sf.net)
led
Can be used to turn on experimental LED code.
0 = Off, 1 = On. Default is 1.
mode
Can be used to set the default mode of the adapter.
0 = Managed, 1 = Ad-Hoc, 2 = Monitor
1.3. Wireless Extension Private Methods
-----------------------------------------------
As an interface designed to handle generic hardware, there are certain
capabilities not exposed through the normal Wireless Tool interface. As
such, a provision is provided for a driver to declare custom, or
private, methods. The Intel(R) PRO/Wireless 2915ABG Driver for Linux
defines several of these to configure various settings.
The general form of using the private wireless methods is:
% iwpriv $IFNAME method parameters
Where $IFNAME is the interface name the device is registered with
(typically eth1, customized via one of the various network interface
name managers, such as ifrename)
The supported private methods are:
get_mode
Can be used to report out which IEEE mode the driver is
configured to support. Example:
% iwpriv eth1 get_mode
eth1 get_mode:802.11bg (6)
set_mode
Can be used to configure which IEEE mode the driver will
support.
Usage:
% iwpriv eth1 set_mode {mode}
Where {mode} is a number in the range 1-7:
1 802.11a (2915 only)
2 802.11b
3 802.11ab (2915 only)
4 802.11g
5 802.11ag (2915 only)
6 802.11bg
7 802.11abg (2915 only)
get_preamble
Can be used to report configuration of preamble length.
set_preamble
Can be used to set the configuration of preamble length:
Usage:
% iwpriv eth1 set_preamble {mode}
Where {mode} is one of:
1 Long preamble only
0 Auto (long or short based on connection)
1.4. Sysfs Helper Files:
-----------------------------------------------
The Linux kernel provides a pseudo file system that can be used to
access various components of the operating system. The Intel(R)
PRO/Wireless 2915ABG Driver for Linux exposes several configuration
parameters through this mechanism.
An entry in the sysfs can support reading and/or writing. You can
typically query the contents of a sysfs entry through the use of cat,
and can set the contents via echo. For example:
% cat /sys/bus/pci/drivers/ipw2200/debug_level
Will report the current debug level of the driver's logging subsystem
(only available if CONFIG_IPW2200_DEBUG was configured when the driver
was built).
You can set the debug level via:
% echo $VALUE > /sys/bus/pci/drivers/ipw2200/debug_level
Where $VALUE would be a number in the case of this sysfs entry. The
input to sysfs files does not have to be a number. For example, the
firmware loader used by hotplug utilizes sysfs entries for transferring
the firmware image from user space into the driver.
The Intel(R) PRO/Wireless 2915ABG Driver for Linux exposes sysfs entries
at two levels -- driver level, which apply to all instances of the driver
(in the event that there are more than one device installed) and device
level, which applies only to the single specific instance.
1.4.1 Driver Level Sysfs Helper Files
-----------------------------------------------
For the driver level files, look in /sys/bus/pci/drivers/ipw2200/
debug_level
This controls the same global as the 'debug' module parameter
1.4.2 Device Level Sysfs Helper Files
-----------------------------------------------
For the device level files, look in
/sys/bus/pci/drivers/ipw2200/{PCI-ID}/
For example:
/sys/bus/pci/drivers/ipw2200/0000:02:01.0
For the device level files, see /sys/bus/pci/drivers/ipw2200:
rf_kill
read -
0 = RF kill not enabled (radio on)
1 = SW based RF kill active (radio off)
2 = HW based RF kill active (radio off)
3 = Both HW and SW RF kill active (radio off)
write -
0 = If SW based RF kill active, turn the radio back on
1 = If radio is on, activate SW based RF kill
NOTE: If you enable the SW based RF kill and then toggle the HW
based RF kill from ON -> OFF -> ON, the radio will NOT come back on
ucode
read-only access to the ucode version number
led
read -
0 = LED code disabled
1 = LED code enabled
write -
0 = Disable LED code
1 = Enable LED code
NOTE: The LED code has been reported to hang some systems when
running ifconfig and is therefore disabled by default.
1.5. Supported channels
-----------------------------------------------
Upon loading the Intel(R) PRO/Wireless 2915ABG Driver for Linux, a
message stating the detected geography code and the number of 802.11
channels supported by the card will be displayed in the log.
The geography code corresponds to a regulatory domain as shown in the
table below.
Supported channels
Code Geography 802.11bg 802.11a
--- Restricted 11 0
ZZF Custom US/Canada 11 8
ZZD Rest of World 13 0
ZZA Custom USA & Europe & High 11 13
ZZB Custom NA & Europe 11 13
ZZC Custom Japan 11 4
ZZM Custom 11 0
ZZE Europe 13 19
ZZJ Custom Japan 14 4
ZZR Rest of World 14 0
ZZH High Band 13 4
ZZG Custom Europe 13 4
ZZK Europe 13 24
ZZL Europe 11 13
2. Ad-Hoc Networking
-----------------------------------------------
When using a device in an Ad-Hoc network, it is useful to understand the
sequence and requirements for the driver to be able to create, join, or
merge networks.
The following attempts to provide enough information so that you can
have a consistent experience while using the driver as a member of an
Ad-Hoc network.
2.1. Joining an Ad-Hoc Network
-----------------------------------------------
The easiest way to get onto an Ad-Hoc network is to join one that
already exists.
2.2. Creating an Ad-Hoc Network
-----------------------------------------------
An Ad-Hoc networks is created using the syntax of the Wireless tool.
For Example:
iwconfig eth1 mode ad-hoc essid testing channel 2
2.3. Merging Ad-Hoc Networks
-----------------------------------------------
3. Interaction with Wireless Tools
-----------------------------------------------
3.1 iwconfig mode
-----------------------------------------------
When configuring the mode of the adapter, all run-time configured parameters
are reset to the value used when the module was loaded. This includes
channels, rates, ESSID, etc.
3.2 iwconfig sens
-----------------------------------------------
The 'iwconfig ethX sens XX' command will not set the signal sensitivity
threshold, as described in iwconfig documentation, but rather the number
of consecutive missed beacons that will trigger handover, i.e. roaming
to another access point. At the same time, it will set the disassociation
threshold to 3 times the given value.
4. About the Version Numbers
-----------------------------------------------
Due to the nature of open source development projects, there are
frequently changes being incorporated that have not gone through
a complete validation process. These changes are incorporated into
development snapshot releases.
Releases are numbered with a three level scheme:
major.minor.development
Any version where the 'development' portion is 0 (for example
1.0.0, 1.1.0, etc.) indicates a stable version that will be made
available for kernel inclusion.
Any version where the 'development' portion is not a 0 (for
example 1.0.1, 1.1.5, etc.) indicates a development version that is
being made available for testing and cutting edge users. The stability
and functionality of the development releases are not know. We make
efforts to try and keep all snapshots reasonably stable, but due to the
frequency of their release, and the desire to get those releases
available as quickly as possible, unknown anomalies should be expected.
The major version number will be incremented when significant changes
are made to the driver. Currently, there are no major changes planned.
5. Firmware installation
----------------------------------------------
The driver requires a firmware image, download it and extract the
files under /lib/firmware (or wherever your hotplug's firmware.agent
will look for firmware files)
The firmware can be downloaded from the following URL:
http://ipw2200.sf.net/
6. Support
-----------------------------------------------
For direct support of the 1.0.0 version, you can contact
http://supportmail.intel.com, or you can use the open source project
support.
For general information and support, go to:
http://ipw2200.sf.net/
7. License
-----------------------------------------------
Copyright(c) 2003 - 2006 Intel Corporation. All rights reserved.
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License version 2 as
published by the Free Software Foundation.
This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59
Temple Place - Suite 330, Boston, MA 02111-1307, USA.
The full GNU General Public License is included in this distribution in the
file called LICENSE.
Contact Information:
James P. Ketrenos <ipw2100-admin@linux.intel.com>
Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497

View file

@ -0,0 +1,207 @@
sb1000 is a module network device driver for the General Instrument (also known
as NextLevel) SURFboard1000 internal cable modem board. This is an ISA card
which is used by a number of cable TV companies to provide cable modem access.
It's a one-way downstream-only cable modem, meaning that your upstream net link
is provided by your regular phone modem.
This driver was written by Franco Venturi <fventuri@mediaone.net>. He deserves
a great deal of thanks for this wonderful piece of code!
-----------------------------------------------------------------------------
Support for this device is now a part of the standard Linux kernel. The
driver source code file is drivers/net/sb1000.c. In addition to this
you will need:
1.) The "cmconfig" program. This is a utility which supplements "ifconfig"
to configure the cable modem and network interface (usually called "cm0");
and
2.) Several PPP scripts which live in /etc/ppp to make connecting via your
cable modem easy.
These utilities can be obtained from:
http://www.jacksonville.net/~fventuri/
in Franco's original source code distribution .tar.gz file. Support for
the sb1000 driver can be found at:
http://web.archive.org/web/*/http://home.adelphia.net/~siglercm/sb1000.html
http://web.archive.org/web/*/http://linuxpower.cx/~cable/
along with these utilities.
3.) The standard isapnp tools. These are necessary to configure your SB1000
card at boot time (or afterwards by hand) since it's a PnP card.
If you don't have these installed as a standard part of your Linux
distribution, you can find them at:
http://www.roestock.demon.co.uk/isapnptools/
or check your Linux distribution binary CD or their web site. For help with
isapnp, pnpdump, or /etc/isapnp.conf, go to:
http://www.roestock.demon.co.uk/isapnptools/isapnpfaq.html
-----------------------------------------------------------------------------
To make the SB1000 card work, follow these steps:
1.) Run `make config', or `make menuconfig', or `make xconfig', whichever
you prefer, in the top kernel tree directory to set up your kernel
configuration. Make sure to say "Y" to "Prompt for development drivers"
and to say "M" to the sb1000 driver. Also say "Y" or "M" to all the standard
networking questions to get TCP/IP and PPP networking support.
2.) *BEFORE* you build the kernel, edit drivers/net/sb1000.c. Make sure
to redefine the value of READ_DATA_PORT to match the I/O address used
by isapnp to access your PnP cards. This is the value of READPORT in
/etc/isapnp.conf or given by the output of pnpdump.
3.) Build and install the kernel and modules as usual.
4.) Boot your new kernel following the usual procedures.
5.) Set up to configure the new SB1000 PnP card by capturing the output
of "pnpdump" to a file and editing this file to set the correct I/O ports,
IRQ, and DMA settings for all your PnP cards. Make sure none of the settings
conflict with one another. Then test this configuration by running the
"isapnp" command with your new config file as the input. Check for
errors and fix as necessary. (As an aside, I use I/O ports 0x110 and
0x310 and IRQ 11 for my SB1000 card and these work well for me. YMMV.)
Then save the finished config file as /etc/isapnp.conf for proper configuration
on subsequent reboots.
6.) Download the original file sb1000-1.1.2.tar.gz from Franco's site or one of
the others referenced above. As root, unpack it into a temporary directory and
do a `make cmconfig' and then `install -c cmconfig /usr/local/sbin'. Don't do
`make install' because it expects to find all the utilities built and ready for
installation, not just cmconfig.
7.) As root, copy all the files under the ppp/ subdirectory in Franco's
tar file into /etc/ppp, being careful not to overwrite any files that are
already in there. Then modify ppp@gi-on to set the correct login name,
phone number, and frequency for the cable modem. Also edit pap-secrets
to specify your login name and password and any site-specific information
you need.
8.) Be sure to modify /etc/ppp/firewall to use ipchains instead of
the older ipfwadm commands from the 2.0.x kernels. There's a neat utility to
convert ipfwadm commands to ipchains commands:
http://users.dhp.com/~whisper/ipfwadm2ipchains/
You may also wish to modify the firewall script to implement a different
firewalling scheme.
9.) Start the PPP connection via the script /etc/ppp/ppp@gi-on. You must be
root to do this. It's better to use a utility like sudo to execute
frequently used commands like this with root permissions if possible. If you
connect successfully the cable modem interface will come up and you'll see a
driver message like this at the console:
cm0: sb1000 at (0x110,0x310), csn 1, S/N 0x2a0d16d8, IRQ 11.
sb1000.c:v1.1.2 6/01/98 (fventuri@mediaone.net)
The "ifconfig" command should show two new interfaces, ppp0 and cm0.
The command "cmconfig cm0" will give you information about the cable modem
interface.
10.) Try pinging a site via `ping -c 5 www.yahoo.com', for example. You should
see packets received.
11.) If you can't get site names (like www.yahoo.com) to resolve into
IP addresses (like 204.71.200.67), be sure your /etc/resolv.conf file
has no syntax errors and has the right nameserver IP addresses in it.
If this doesn't help, try something like `ping -c 5 204.71.200.67' to
see if the networking is running but the DNS resolution is where the
problem lies.
12.) If you still have problems, go to the support web sites mentioned above
and read the information and documentation there.
-----------------------------------------------------------------------------
Common problems:
1.) Packets go out on the ppp0 interface but don't come back on the cm0
interface. It looks like I'm connected but I can't even ping any
numerical IP addresses. (This happens predominantly on Debian systems due
to a default boot-time configuration script.)
Solution -- As root `echo 0 > /proc/sys/net/ipv4/conf/cm0/rp_filter' so it
can share the same IP address as the ppp0 interface. Note that this
command should probably be added to the /etc/ppp/cablemodem script
*right*between* the "/sbin/ifconfig" and "/sbin/cmconfig" commands.
You may need to do this to /proc/sys/net/ipv4/conf/ppp0/rp_filter as well.
If you do this to /proc/sys/net/ipv4/conf/default/rp_filter on each reboot
(in rc.local or some such) then any interfaces can share the same IP
addresses.
2.) I get "unresolved symbol" error messages on executing `insmod sb1000.o'.
Solution -- You probably have a non-matching kernel source tree and
/usr/include/linux and /usr/include/asm header files. Make sure you
install the correct versions of the header files in these two directories.
Then rebuild and reinstall the kernel.
3.) When isapnp runs it reports an error, and my SB1000 card isn't working.
Solution -- There's a problem with later versions of isapnp using the "(CHECK)"
option in the lines that allocate the two I/O addresses for the SB1000 card.
This first popped up on RH 6.0. Delete "(CHECK)" for the SB1000 I/O addresses.
Make sure they don't conflict with any other pieces of hardware first! Then
rerun isapnp and go from there.
4.) I can't execute the /etc/ppp/ppp@gi-on file.
Solution -- As root do `chmod ug+x /etc/ppp/ppp@gi-on'.
5.) The firewall script isn't working (with 2.2.x and higher kernels).
Solution -- Use the ipfwadm2ipchains script referenced above to convert the
/etc/ppp/firewall script from the deprecated ipfwadm commands to ipchains.
6.) I'm getting *tons* of firewall deny messages in the /var/kern.log,
/var/messages, and/or /var/syslog files, and they're filling up my /var
partition!!!
Solution -- First, tell your ISP that you're receiving DoS (Denial of Service)
and/or portscanning (UDP connection attempts) attacks! Look over the deny
messages to figure out what the attack is and where it's coming from. Next,
edit /etc/ppp/cablemodem and make sure the ",nobroadcast" option is turned on
to the "cmconfig" command (uncomment that line). If you're not receiving these
denied packets on your broadcast interface (IP address xxx.yyy.zzz.255
typically), then someone is attacking your machine in particular. Be careful
out there....
7.) Everything seems to work fine but my computer locks up after a while
(and typically during a lengthy download through the cable modem)!
Solution -- You may need to add a short delay in the driver to 'slow down' the
SURFboard because your PC might not be able to keep up with the transfer rate
of the SB1000. To do this, it's probably best to download Franco's
sb1000-1.1.2.tar.gz archive and build and install sb1000.o manually. You'll
want to edit the 'Makefile' and look for the 'SB1000_DELAY'
define. Uncomment those 'CFLAGS' lines (and comment out the default ones)
and try setting the delay to something like 60 microseconds with:
'-DSB1000_DELAY=60'. Then do `make' and as root `make install' and try
it out. If it still doesn't work or you like playing with the driver, you may
try other numbers. Remember though that the higher the delay, the slower the
driver (which slows down the rest of the PC too when it is actively
used). Thanks to Ed Daiga for this tip!
-----------------------------------------------------------------------------
Credits: This README came from Franco Venturi's original README file which is
still supplied with his driver .tar.gz archive. I and all other sb1000 users
owe Franco a tremendous "Thank you!" Additional thanks goes to Carl Patten
and Ralph Bonnell who are now managing the Linux SB1000 web site, and to
the SB1000 users who reported and helped debug the common problems listed
above.
Clemmitt Sigler
csigler@vt.edu

View file

@ -0,0 +1,40 @@
IP-Aliasing:
============
IP-aliases are an obsolete way to manage multiple IP-addresses/masks
per interface. Newer tools such as iproute2 support multiple
address/prefixes per interface, but aliases are still supported
for backwards compatibility.
An alias is formed by adding a colon and a string when running ifconfig.
This string is usually numeric, but this is not a must.
o Alias creation.
Alias creation is done by 'magic' interface naming: eg. to create a
200.1.1.1 alias for eth0 ...
# ifconfig eth0:0 200.1.1.1 etc,etc....
~~ -> request alias #0 creation (if not yet exists) for eth0
The corresponding route is also set up by this command.
Please note: The route always points to the base interface.
o Alias deletion.
The alias is removed by shutting the alias down:
# ifconfig eth0:0 down
~~~~~~~~~~ -> will delete alias
o Alias (re-)configuring
Aliases are not real devices, but programs should be able to configure and
refer to them as usual (ifconfig, route, etc).
o Relationship with main device
If the base device is shut down the added aliases will be deleted
too.

View file

@ -0,0 +1,263 @@
Altera Triple-Speed Ethernet MAC driver
Copyright (C) 2008-2014 Altera Corporation
This is the driver for the Altera Triple-Speed Ethernet (TSE) controllers
using the SGDMA and MSGDMA soft DMA IP components. The driver uses the
platform bus to obtain component resources. The designs used to test this
driver were built for a Cyclone(R) V SOC FPGA board, a Cyclone(R) V FPGA board,
and tested with ARM and NIOS processor hosts seperately. The anticipated use
cases are simple communications between an embedded system and an external peer
for status and simple configuration of the embedded system.
For more information visit www.altera.com and www.rocketboards.org. Support
forums for the driver may be found on www.rocketboards.org, and a design used
to test this driver may be found there as well. Support is also available from
the maintainer of this driver, found in MAINTAINERS.
The Triple-Speed Ethernet, SGDMA, and MSGDMA components are all soft IP
components that can be assembled and built into an FPGA using the Altera
Quartus toolchain. Quartus 13.1 and 14.0 were used to build the design that
this driver was tested against. The sopc2dts tool is used to create the
device tree for the driver, and may be found at rocketboards.org.
The driver probe function examines the device tree and determines if the
Triple-Speed Ethernet instance is using an SGDMA or MSGDMA component. The
probe function then installs the appropriate set of DMA routines to
initialize, setup transmits, receives, and interrupt handling primitives for
the respective configurations.
The SGDMA component is to be deprecated in the near future (over the next 1-2
years as of this writing in early 2014) in favor of the MSGDMA component.
SGDMA support is included for existing designs and reference in case a
developer wishes to support their own soft DMA logic and driver support. Any
new designs should not use the SGDMA.
The SGDMA supports only a single transmit or receive operation at a time, and
therefore will not perform as well compared to the MSGDMA soft IP. Please
visit www.altera.com for known, documented SGDMA errata.
Scatter-gather DMA is not supported by the SGDMA or MSGDMA at this time.
Scatter-gather DMA will be added to a future maintenance update to this
driver.
Jumbo frames are not supported at this time.
The driver limits PHY operations to 10/100Mbps, and has not yet been fully
tested for 1Gbps. This support will be added in a future maintenance update.
1) Kernel Configuration
The kernel configuration option is ALTERA_TSE:
Device Drivers ---> Network device support ---> Ethernet driver support --->
Altera Triple-Speed Ethernet MAC support (ALTERA_TSE)
2) Driver parameters list:
debug: message level (0: no output, 16: all);
dma_rx_num: Number of descriptors in the RX list (default is 64);
dma_tx_num: Number of descriptors in the TX list (default is 64).
3) Command line options
Driver parameters can be also passed in command line by using:
altera_tse=dma_rx_num:128,dma_tx_num:512
4) Driver information and notes
4.1) Transmit process
When the driver's transmit routine is called by the kernel, it sets up a
transmit descriptor by calling the underlying DMA transmit routine (SGDMA or
MSGDMA), and initites a transmit operation. Once the transmit is complete, an
interrupt is driven by the transmit DMA logic. The driver handles the transmit
completion in the context of the interrupt handling chain by recycling
resource required to send and track the requested transmit operation.
4.2) Receive process
The driver will post receive buffers to the receive DMA logic during driver
intialization. Receive buffers may or may not be queued depending upon the
underlying DMA logic (MSGDMA is able queue receive buffers, SGDMA is not able
to queue receive buffers to the SGDMA receive logic). When a packet is
received, the DMA logic generates an interrupt. The driver handles a receive
interrupt by obtaining the DMA receive logic status, reaping receive
completions until no more receive completions are available.
4.3) Interrupt Mitigation
The driver is able to mitigate the number of its DMA interrupts
using NAPI for receive operations. Interrupt mitigation is not yet supported
for transmit operations, but will be added in a future maintenance release.
4.4) Ethtool support
Ethtool is supported. Driver statistics and internal errors can be taken using:
ethtool -S ethX command. It is possible to dump registers etc.
4.5) PHY Support
The driver is compatible with PAL to work with PHY and GPHY devices.
4.7) List of source files:
o Kconfig
o Makefile
o altera_tse_main.c: main network device driver
o altera_tse_ethtool.c: ethtool support
o altera_tse.h: private driver structure and common definitions
o altera_msgdma.h: MSGDMA implementation function definitions
o altera_sgdma.h: SGDMA implementation function definitions
o altera_msgdma.c: MSGDMA implementation
o altera_sgdma.c: SGDMA implementation
o altera_sgdmahw.h: SGDMA register and descriptor definitions
o altera_msgdmahw.h: MSGDMA register and descriptor definitions
o altera_utils.c: Driver utility functions
o altera_utils.h: Driver utility function definitions
5) Debug Information
The driver exports debug information such as internal statistics,
debug information, MAC and DMA registers etc.
A user may use the ethtool support to get statistics:
e.g. using: ethtool -S ethX (that shows the statistics counters)
or sees the MAC registers: e.g. using: ethtool -d ethX
The developer can also use the "debug" module parameter to get
further debug information.
6) Statistics Support
The controller and driver support a mix of IEEE standard defined statistics,
RFC defined statistics, and driver or Altera defined statistics. The four
specifications containing the standard definitions for these statistics are
as follows:
o IEEE 802.3-2012 - IEEE Standard for Ethernet.
o RFC 2863 found at http://www.rfc-editor.org/rfc/rfc2863.txt.
o RFC 2819 found at http://www.rfc-editor.org/rfc/rfc2819.txt.
o Altera Triple Speed Ethernet User Guide, found at http://www.altera.com
The statistics supported by the TSE and the device driver are as follows:
"tx_packets" is equivalent to aFramesTransmittedOK defined in IEEE 802.3-2012,
Section 5.2.2.1.2. This statistics is the count of frames that are successfully
transmitted.
"rx_packets" is equivalent to aFramesReceivedOK defined in IEEE 802.3-2012,
Section 5.2.2.1.5. This statistic is the count of frames that are successfully
received. This count does not include any error packets such as CRC errors,
length errors, or alignment errors.
"rx_crc_errors" is equivalent to aFrameCheckSequenceErrors defined in IEEE
802.3-2012, Section 5.2.2.1.6. This statistic is the count of frames that are
an integral number of bytes in length and do not pass the CRC test as the frame
is received.
"rx_align_errors" is equivalent to aAlignmentErrors defined in IEEE 802.3-2012,
Section 5.2.2.1.7. This statistic is the count of frames that are not an
integral number of bytes in length and do not pass the CRC test as the frame is
received.
"tx_bytes" is equivalent to aOctetsTransmittedOK defined in IEEE 802.3-2012,
Section 5.2.2.1.8. This statistic is the count of data and pad bytes
successfully transmitted from the interface.
"rx_bytes" is equivalent to aOctetsReceivedOK defined in IEEE 802.3-2012,
Section 5.2.2.1.14. This statistic is the count of data and pad bytes
successfully received by the controller.
"tx_pause" is equivalent to aPAUSEMACCtrlFramesTransmitted defined in IEEE
802.3-2012, Section 30.3.4.2. This statistic is a count of PAUSE frames
transmitted from the network controller.
"rx_pause" is equivalent to aPAUSEMACCtrlFramesReceived defined in IEEE
802.3-2012, Section 30.3.4.3. This statistic is a count of PAUSE frames
received by the network controller.
"rx_errors" is equivalent to ifInErrors defined in RFC 2863. This statistic is
a count of the number of packets received containing errors that prevented the
packet from being delivered to a higher level protocol.
"tx_errors" is equivalent to ifOutErrors defined in RFC 2863. This statistic
is a count of the number of packets that could not be transmitted due to errors.
"rx_unicast" is equivalent to ifInUcastPkts defined in RFC 2863. This
statistic is a count of the number of packets received that were not addressed
to the broadcast address or a multicast group.
"rx_multicast" is equivalent to ifInMulticastPkts defined in RFC 2863. This
statistic is a count of the number of packets received that were addressed to
a multicast address group.
"rx_broadcast" is equivalent to ifInBroadcastPkts defined in RFC 2863. This
statistic is a count of the number of packets received that were addressed to
the broadcast address.
"tx_discards" is equivalent to ifOutDiscards defined in RFC 2863. This
statistic is the number of outbound packets not transmitted even though an
error was not detected. An example of a reason this might occur is to free up
internal buffer space.
"tx_unicast" is equivalent to ifOutUcastPkts defined in RFC 2863. This
statistic counts the number of packets transmitted that were not addressed to
a multicast group or broadcast address.
"tx_multicast" is equivalent to ifOutMulticastPkts defined in RFC 2863. This
statistic counts the number of packets transmitted that were addressed to a
multicast group.
"tx_broadcast" is equivalent to ifOutBroadcastPkts defined in RFC 2863. This
statistic counts the number of packets transmitted that were addressed to a
broadcast address.
"ether_drops" is equivalent to etherStatsDropEvents defined in RFC 2819.
This statistic counts the number of packets dropped due to lack of internal
controller resources.
"rx_total_bytes" is equivalent to etherStatsOctets defined in RFC 2819.
This statistic counts the total number of bytes received by the controller,
including error and discarded packets.
"rx_total_packets" is equivalent to etherStatsPkts defined in RFC 2819.
This statistic counts the total number of packets received by the controller,
including error, discarded, unicast, multicast, and broadcast packets.
"rx_undersize" is equivalent to etherStatsUndersizePkts defined in RFC 2819.
This statistic counts the number of correctly formed packets received less
than 64 bytes long.
"rx_oversize" is equivalent to etherStatsOversizePkts defined in RFC 2819.
This statistic counts the number of correctly formed packets greater than 1518
bytes long.
"rx_64_bytes" is equivalent to etherStatsPkts64Octets defined in RFC 2819.
This statistic counts the total number of packets received that were 64 octets
in length.
"rx_65_127_bytes" is equivalent to etherStatsPkts65to127Octets defined in RFC
2819. This statistic counts the total number of packets received that were
between 65 and 127 octets in length inclusive.
"rx_128_255_bytes" is equivalent to etherStatsPkts128to255Octets defined in
RFC 2819. This statistic is the total number of packets received that were
between 128 and 255 octets in length inclusive.
"rx_256_511_bytes" is equivalent to etherStatsPkts256to511Octets defined in
RFC 2819. This statistic is the total number of packets received that were
between 256 and 511 octets in length inclusive.
"rx_512_1023_bytes" is equivalent to etherStatsPkts512to1023Octets defined in
RFC 2819. This statistic is the total number of packets received that were
between 512 and 1023 octets in length inclusive.
"rx_1024_1518_bytes" is equivalent to etherStatsPkts1024to1518Octets define
in RFC 2819. This statistic is the total number of packets received that were
between 1024 and 1518 octets in length inclusive.
"rx_gte_1519_bytes" is a statistic defined specific to the behavior of the
Altera TSE. This statistics counts the number of received good and errored
frames between the length of 1519 and the maximum frame length configured
in the frm_length register. See the Altera TSE User Guide for More details.
"rx_jabbers" is equivalent to etherStatsJabbers defined in RFC 2819. This
statistic is the total number of packets received that were longer than 1518
octets, and had either a bad CRC with an integral number of octets (CRC Error)
or a bad CRC with a non-integral number of octets (Alignment Error).
"rx_runts" is equivalent to etherStatsFragments defined in RFC 2819. This
statistic is the total number of packets received that were less than 64 octets
in length and had either a bad CRC with an integral number of octets (CRC
error) or a bad CRC with a non-integral number of octets (Alignment Error).

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,556 @@
----------------------------------------------------------------------------
NOTE: See also arcnet-hardware.txt in this directory for jumper-setting
and cabling information if you're like many of us and didn't happen to get a
manual with your ARCnet card.
----------------------------------------------------------------------------
Since no one seems to listen to me otherwise, perhaps a poem will get your
attention:
This driver's getting fat and beefy,
But my cat is still named Fifi.
Hmm, I think I'm allowed to call that a poem, even though it's only two
lines. Hey, I'm in Computer Science, not English. Give me a break.
The point is: I REALLY REALLY REALLY REALLY REALLY want to hear from you if
you test this and get it working. Or if you don't. Or anything.
ARCnet 0.32 ALPHA first made it into the Linux kernel 1.1.80 - this was
nice, but after that even FEWER people started writing to me because they
didn't even have to install the patch. <sigh>
Come on, be a sport! Send me a success report!
(hey, that was even better than my original poem... this is getting bad!)
--------
WARNING:
--------
If you don't e-mail me about your success/failure soon, I may be forced to
start SINGING. And we don't want that, do we?
(You know, it might be argued that I'm pushing this point a little too much.
If you think so, why not flame me in a quick little e-mail? Please also
include the type of card(s) you're using, software, size of network, and
whether it's working or not.)
My e-mail address is: apenwarr@worldvisions.ca
---------------------------------------------------------------------------
These are the ARCnet drivers for Linux.
This new release (2.91) has been put together by David Woodhouse
<dwmw2@infradead.org>, in an attempt to tidy up the driver after adding support
for yet another chipset. Now the generic support has been separated from the
individual chipset drivers, and the source files aren't quite so packed with
#ifdefs! I've changed this file a bit, but kept it in the first person from
Avery, because I didn't want to completely rewrite it.
The previous release resulted from many months of on-and-off effort from me
(Avery Pennarun), many bug reports/fixes and suggestions from others, and in
particular a lot of input and coding from Tomasz Motylewski. Starting with
ARCnet 2.10 ALPHA, Tomasz's all-new-and-improved RFC1051 support has been
included and seems to be working fine!
Where do I discuss these drivers?
---------------------------------
Tomasz has been so kind as to set up a new and improved mailing list.
Subscribe by sending a message with the BODY "subscribe linux-arcnet YOUR
REAL NAME" to listserv@tichy.ch.uj.edu.pl. Then, to submit messages to the
list, mail to linux-arcnet@tichy.ch.uj.edu.pl.
There are archives of the mailing list at:
http://epistolary.org/mailman/listinfo.cgi/arcnet
The people on linux-net@vger.kernel.org (now defunct, replaced by
netdev@vger.kernel.org) have also been known to be very helpful, especially
when we're talking about ALPHA Linux kernels that may or may not work right
in the first place.
Other Drivers and Info
----------------------
You can try my ARCNET page on the World Wide Web at:
http://www.qis.net/~jschmitz/arcnet/
Also, SMC (one of the companies that makes ARCnet cards) has a WWW site you
might be interested in, which includes several drivers for various cards
including ARCnet. Try:
http://www.smc.com/
Performance Technologies makes various network software that supports
ARCnet:
http://www.perftech.com/ or ftp to ftp.perftech.com.
Novell makes a networking stack for DOS which includes ARCnet drivers. Try
FTPing to ftp.novell.com.
You can get the Crynwr packet driver collection (including arcether.com, the
one you'll want to use with ARCnet cards) from
oak.oakland.edu:/simtel/msdos/pktdrvr. It won't work perfectly on a 386+
without patches, though, and also doesn't like several cards. Fixed
versions are available on my WWW page, or via e-mail if you don't have WWW
access.
Installing the Driver
---------------------
All you will need to do in order to install the driver is:
make config
(be sure to choose ARCnet in the network devices
and at least one chipset driver.)
make clean
make zImage
If you obtained this ARCnet package as an upgrade to the ARCnet driver in
your current kernel, you will need to first copy arcnet.c over the one in
the linux/drivers/net directory.
You will know the driver is installed properly if you get some ARCnet
messages when you reboot into the new Linux kernel.
There are four chipset options:
1. Standard ARCnet COM90xx chipset.
This is the normal ARCnet card, which you've probably got. This is the only
chipset driver which will autoprobe if not told where the card is.
It following options on the command line:
com90xx=[<io>[,<irq>[,<shmem>]]][,<name>] | <name>
If you load the chipset support as a module, the options are:
io=<io> irq=<irq> shmem=<shmem> device=<name>
To disable the autoprobe, just specify "com90xx=" on the kernel command line.
To specify the name alone, but allow autoprobe, just put "com90xx=<name>"
2. ARCnet COM20020 chipset.
This is the new chipset from SMC with support for promiscuous mode (packet
sniffing), extra diagnostic information, etc. Unfortunately, there is no
sensible method of autoprobing for these cards. You must specify the I/O
address on the kernel command line.
The command line options are:
com20020=<io>[,<irq>[,<node_ID>[,backplane[,CKP[,timeout]]]]][,name]
If you load the chipset support as a module, the options are:
io=<io> irq=<irq> node=<node_ID> backplane=<backplane> clock=<CKP>
timeout=<timeout> device=<name>
The COM20020 chipset allows you to set the node ID in software, overriding the
default which is still set in DIP switches on the card. If you don't have the
COM20020 data sheets, and you don't know what the other three options refer
to, then they won't interest you - forget them.
3. ARCnet COM90xx chipset in IO-mapped mode.
This will also work with the normal ARCnet cards, but doesn't use the shared
memory. It performs less well than the above driver, but is provided in case
you have a card which doesn't support shared memory, or (strangely) in case
you have so many ARCnet cards in your machine that you run out of shmem slots.
If you don't give the IO address on the kernel command line, then the driver
will not find the card.
The command line options are:
com90io=<io>[,<irq>][,<name>]
If you load the chipset support as a module, the options are:
io=<io> irq=<irq> device=<name>
4. ARCnet RIM I cards.
These are COM90xx chips which are _completely_ memory mapped. The support for
these is not tested. If you have one, please mail the author with a success
report. All options must be specified, except the device name.
Command line options:
arcrimi=<shmem>,<irq>,<node_ID>[,<name>]
If you load the chipset support as a module, the options are:
shmem=<shmem> irq=<irq> node=<node_ID> device=<name>
Loadable Module Support
-----------------------
Configure and rebuild Linux. When asked, answer 'm' to "Generic ARCnet
support" and to support for your ARCnet chipset if you want to use the
loadable module. You can also say 'y' to "Generic ARCnet support" and 'm'
to the chipset support if you wish.
make config
make clean
make zImage
make modules
If you're using a loadable module, you need to use insmod to load it, and
you can specify various characteristics of your card on the command
line. (In recent versions of the driver, autoprobing is much more reliable
and works as a module, so most of this is now unnecessary.)
For example:
cd /usr/src/linux/modules
insmod arcnet.o
insmod com90xx.o
insmod com20020.o io=0x2e0 device=eth1
Using the Driver
----------------
If you build your kernel with ARCnet COM90xx support included, it should
probe for your card automatically when you boot. If you use a different
chipset driver complied into the kernel, you must give the necessary options
on the kernel command line, as detailed above.
Go read the NET-2-HOWTO and ETHERNET-HOWTO for Linux; they should be
available where you picked up this driver. Think of your ARCnet as a
souped-up (or down, as the case may be) Ethernet card.
By the way, be sure to change all references from "eth0" to "arc0" in the
HOWTOs. Remember that ARCnet isn't a "true" Ethernet, and the device name
is DIFFERENT.
Multiple Cards in One Computer
------------------------------
Linux has pretty good support for this now, but since I've been busy, the
ARCnet driver has somewhat suffered in this respect. COM90xx support, if
compiled into the kernel, will (try to) autodetect all the installed cards.
If you have other cards, with support compiled into the kernel, then you can
just repeat the options on the kernel command line, e.g.:
LILO: linux com20020=0x2e0 com20020=0x380 com90io=0x260
If you have the chipset support built as a loadable module, then you need to
do something like this:
insmod -o arc0 com90xx
insmod -o arc1 com20020 io=0x2e0
insmod -o arc2 com90xx
The ARCnet drivers will now sort out their names automatically.
How do I get it to work with...?
--------------------------------
NFS: Should be fine linux->linux, just pretend you're using Ethernet cards.
oak.oakland.edu:/simtel/msdos/nfs has some nice DOS clients. There
is also a DOS-based NFS server called SOSS. It doesn't multitask
quite the way Linux does (actually, it doesn't multitask AT ALL) but
you never know what you might need.
With AmiTCP (and possibly others), you may need to set the following
options in your Amiga nfstab: MD 1024 MR 1024 MW 1024
(Thanks to Christian Gottschling <ferksy@indigo.tng.oche.de>
for this.)
Probably these refer to maximum NFS data/read/write block sizes. I
don't know why the defaults on the Amiga didn't work; write to me if
you know more.
DOS: If you're using the freeware arcether.com, you might want to install
the driver patch from my web page. It helps with PC/TCP, and also
can get arcether to load if it timed out too quickly during
initialization. In fact, if you use it on a 386+ you REALLY need
the patch, really.
Windows: See DOS :) Trumpet Winsock works fine with either the Novell or
Arcether client, assuming you remember to load winpkt of course.
LAN Manager and Windows for Workgroups: These programs use protocols that
are incompatible with the Internet standard. They try to pretend
the cards are Ethernet, and confuse everyone else on the network.
However, v2.00 and higher of the Linux ARCnet driver supports this
protocol via the 'arc0e' device. See the section on "Multiprotocol
Support" for more information.
Using the freeware Samba server and clients for Linux, you can now
interface quite nicely with TCP/IP-based WfWg or Lan Manager
networks.
Windows 95: Tools are included with Win95 that let you use either the LANMAN
style network drivers (NDIS) or Novell drivers (ODI) to handle your
ARCnet packets. If you use ODI, you'll need to use the 'arc0'
device with Linux. If you use NDIS, then try the 'arc0e' device.
See the "Multiprotocol Support" section below if you need arc0e,
you're completely insane, and/or you need to build some kind of
hybrid network that uses both encapsulation types.
OS/2: I've been told it works under Warp Connect with an ARCnet driver from
SMC. You need to use the 'arc0e' interface for this. If you get
the SMC driver to work with the TCP/IP stuff included in the
"normal" Warp Bonus Pack, let me know.
ftp.microsoft.com also has a freeware "Lan Manager for OS/2" client
which should use the same protocol as WfWg does. I had no luck
installing it under Warp, however. Please mail me with any results.
NetBSD/AmiTCP: These use an old version of the Internet standard ARCnet
protocol (RFC1051) which is compatible with the Linux driver v2.10
ALPHA and above using the arc0s device. (See "Multiprotocol ARCnet"
below.) ** Newer versions of NetBSD apparently support RFC1201.
Using Multiprotocol ARCnet
--------------------------
The ARCnet driver v2.10 ALPHA supports three protocols, each on its own
"virtual network device":
arc0 - RFC1201 protocol, the official Internet standard which just
happens to be 100% compatible with Novell's TRXNET driver.
Version 1.00 of the ARCnet driver supported _only_ this
protocol. arc0 is the fastest of the three protocols (for
whatever reason), and allows larger packets to be used
because it supports RFC1201 "packet splitting" operations.
Unless you have a specific need to use a different protocol,
I strongly suggest that you stick with this one.
arc0e - "Ethernet-Encapsulation" which sends packets over ARCnet
that are actually a lot like Ethernet packets, including the
6-byte hardware addresses. This protocol is compatible with
Microsoft's NDIS ARCnet driver, like the one in WfWg and
LANMAN. Because the MTU of 493 is actually smaller than the
one "required" by TCP/IP (576), there is a chance that some
network operations will not function properly. The Linux
TCP/IP layer can compensate in most cases, however, by
automatically fragmenting the TCP/IP packets to make them
fit. arc0e also works slightly more slowly than arc0, for
reasons yet to be determined. (Probably it's the smaller
MTU that does it.)
arc0s - The "[s]imple" RFC1051 protocol is the "previous" Internet
standard that is completely incompatible with the new
standard. Some software today, however, continues to
support the old standard (and only the old standard)
including NetBSD and AmiTCP. RFC1051 also does not support
RFC1201's packet splitting, and the MTU of 507 is still
smaller than the Internet "requirement," so it's quite
possible that you may run into problems. It's also slower
than RFC1201 by about 25%, for the same reason as arc0e.
The arc0s support was contributed by Tomasz Motylewski
and modified somewhat by me. Bugs are probably my fault.
You can choose not to compile arc0e and arc0s into the driver if you want -
this will save you a bit of memory and avoid confusion when eg. trying to
use the "NFS-root" stuff in recent Linux kernels.
The arc0e and arc0s devices are created automatically when you first
ifconfig the arc0 device. To actually use them, though, you need to also
ifconfig the other virtual devices you need. There are a number of ways you
can set up your network then:
1. Single Protocol.
This is the simplest way to configure your network: use just one of the
two available protocols. As mentioned above, it's a good idea to use
only arc0 unless you have a good reason (like some other software, ie.
WfWg, that only works with arc0e).
If you need only arc0, then the following commands should get you going:
ifconfig arc0 MY.IP.ADD.RESS
route add MY.IP.ADD.RESS arc0
route add -net SUB.NET.ADD.RESS arc0
[add other local routes here]
If you need arc0e (and only arc0e), it's a little different:
ifconfig arc0 MY.IP.ADD.RESS
ifconfig arc0e MY.IP.ADD.RESS
route add MY.IP.ADD.RESS arc0e
route add -net SUB.NET.ADD.RESS arc0e
arc0s works much the same way as arc0e.
2. More than one protocol on the same wire.
Now things start getting confusing. To even try it, you may need to be
partly crazy. Here's what *I* did. :) Note that I don't include arc0s in
my home network; I don't have any NetBSD or AmiTCP computers, so I only
use arc0s during limited testing.
I have three computers on my home network; two Linux boxes (which prefer
RFC1201 protocol, for reasons listed above), and one XT that can't run
Linux but runs the free Microsoft LANMAN Client instead.
Worse, one of the Linux computers (freedom) also has a modem and acts as
a router to my Internet provider. The other Linux box (insight) also has
its own IP address and needs to use freedom as its default gateway. The
XT (patience), however, does not have its own Internet IP address and so
I assigned it one on a "private subnet" (as defined by RFC1597).
To start with, take a simple network with just insight and freedom.
Insight needs to:
- talk to freedom via RFC1201 (arc0) protocol, because I like it
more and it's faster.
- use freedom as its Internet gateway.
That's pretty easy to do. Set up insight like this:
ifconfig arc0 insight
route add insight arc0
route add freedom arc0 /* I would use the subnet here (like I said
to to in "single protocol" above),
but the rest of the subnet
unfortunately lies across the PPP
link on freedom, which confuses
things. */
route add default gw freedom
And freedom gets configured like so:
ifconfig arc0 freedom
route add freedom arc0
route add insight arc0
/* and default gateway is configured by pppd */
Great, now insight talks to freedom directly on arc0, and sends packets
to the Internet through freedom. If you didn't know how to do the above,
you should probably stop reading this section now because it only gets
worse.
Now, how do I add patience into the network? It will be using LANMAN
Client, which means I need the arc0e device. It needs to be able to talk
to both insight and freedom, and also use freedom as a gateway to the
Internet. (Recall that patience has a "private IP address" which won't
work on the Internet; that's okay, I configured Linux IP masquerading on
freedom for this subnet).
So patience (necessarily; I don't have another IP number from my
provider) has an IP address on a different subnet than freedom and
insight, but needs to use freedom as an Internet gateway. Worse, most
DOS networking programs, including LANMAN, have braindead networking
schemes that rely completely on the netmask and a 'default gateway' to
determine how to route packets. This means that to get to freedom or
insight, patience WILL send through its default gateway, regardless of
the fact that both freedom and insight (courtesy of the arc0e device)
could understand a direct transmission.
I compensate by giving freedom an extra IP address - aliased 'gatekeeper'
- that is on my private subnet, the same subnet that patience is on. I
then define gatekeeper to be the default gateway for patience.
To configure freedom (in addition to the commands above):
ifconfig arc0e gatekeeper
route add gatekeeper arc0e
route add patience arc0e
This way, freedom will send all packets for patience through arc0e,
giving its IP address as gatekeeper (on the private subnet). When it
talks to insight or the Internet, it will use its "freedom" Internet IP
address.
You will notice that we haven't configured the arc0e device on insight.
This would work, but is not really necessary, and would require me to
assign insight another special IP number from my private subnet. Since
both insight and patience are using freedom as their default gateway, the
two can already talk to each other.
It's quite fortunate that I set things up like this the first time (cough
cough) because it's really handy when I boot insight into DOS. There, it
runs the Novell ODI protocol stack, which only works with RFC1201 ARCnet.
In this mode it would be impossible for insight to communicate directly
with patience, since the Novell stack is incompatible with Microsoft's
Ethernet-Encap. Without changing any settings on freedom or patience, I
simply set freedom as the default gateway for insight (now in DOS,
remember) and all the forwarding happens "automagically" between the two
hosts that would normally not be able to communicate at all.
For those who like diagrams, I have created two "virtual subnets" on the
same physical ARCnet wire. You can picture it like this:
[RFC1201 NETWORK] [ETHER-ENCAP NETWORK]
(registered Internet subnet) (RFC1597 private subnet)
(IP Masquerade)
/---------------\ * /---------------\
| | * | |
| +-Freedom-*-Gatekeeper-+ |
| | | * | |
\-------+-------/ | * \-------+-------/
| | |
Insight | Patience
(Internet)
It works: what now?
-------------------
Send mail describing your setup, preferably including driver version, kernel
version, ARCnet card model, CPU type, number of systems on your network, and
list of software in use to me at the following address:
apenwarr@worldvisions.ca
I do send (sometimes automated) replies to all messages I receive. My email
can be weird (and also usually gets forwarded all over the place along the
way to me), so if you don't get a reply within a reasonable time, please
resend.
It doesn't work: what now?
--------------------------
Do the same as above, but also include the output of the ifconfig and route
commands, as well as any pertinent log entries (ie. anything that starts
with "arcnet:" and has shown up since the last reboot) in your mail.
If you want to try fixing it yourself (I strongly recommend that you mail me
about the problem first, since it might already have been solved) you may
want to try some of the debug levels available. For heavy testing on
D_DURING or more, it would be a REALLY good idea to kill your klogd daemon
first! D_DURING displays 4-5 lines for each packet sent or received. D_TX,
D_RX, and D_SKB actually DISPLAY each packet as it is sent or received,
which is obviously quite big.
Starting with v2.40 ALPHA, the autoprobe routines have changed
significantly. In particular, they won't tell you why the card was not
found unless you turn on the D_INIT_REASONS debugging flag.
Once the driver is running, you can run the arcdump shell script (available
from me or in the full ARCnet package, if you have it) as root to list the
contents of the arcnet buffers at any time. To make any sense at all out of
this, you should grab the pertinent RFCs. (some are listed near the top of
arcnet.c). arcdump assumes your card is at 0xD0000. If it isn't, edit the
script.
Buffers 0 and 1 are used for receiving, and Buffers 2 and 3 are for sending.
Ping-pong buffers are implemented both ways.
If your debug level includes D_DURING and you did NOT define SLOW_XMIT_COPY,
the buffers are cleared to a constant value of 0x42 every time the card is
reset (which should only happen when you do an ifconfig up, or when Linux
decides that the driver is broken). During a transmit, unused parts of the
buffer will be cleared to 0x42 as well. This is to make it easier to figure
out which bytes are being used by a packet.
You can change the debug level without recompiling the kernel by typing:
ifconfig arc0 down metric 1xxx
/etc/rc.d/rc.inet1
where "xxx" is the debug level you want. For example, "metric 1015" would put
you at debug level 15. Debug level 7 is currently the default.
Note that the debug level is (starting with v1.90 ALPHA) a binary
combination of different debug flags; so debug level 7 is really 1+2+4 or
D_NORMAL+D_EXTRA+D_INIT. To include D_DURING, you would add 16 to this,
resulting in debug level 23.
If you don't understand that, you probably don't want to know anyway.
E-mail me about your problem.
I want to send money: what now?
-------------------------------
Go take a nap or something. You'll feel better in the morning.

View file

@ -0,0 +1,8 @@
In order to use anything but the most primitive functions of ATM,
several user-mode programs are required to assist the kernel. These
programs and related material can be found via the ATM on Linux Web
page at http://linux-atm.sourceforge.net/
If you encounter problems with ATM, please report them on the ATM
on Linux mailing list. Subscription information, archives, etc.,
can be found on http://linux-atm.sourceforge.net/

View file

@ -0,0 +1,10 @@
To use the amateur radio protocols within Linux you will need to get a
suitable copy of the AX.25 Utilities. More detailed information about
AX.25, NET/ROM and ROSE, associated programs and and utilities can be
found on http://www.linux-ax25.org.
There is an active mailing list for discussing Linux amateur radio matters
called linux-hams@vger.kernel.org. To subscribe to it, send a message to
majordomo@vger.kernel.org with the words "subscribe linux-hams" in the body
of the message, the subject field is ignored. You don't need to be
subscribed to post but of course that means you might miss an answer.

View file

@ -0,0 +1,202 @@
BATMAN-ADV
----------
Batman advanced is a new approach to wireless networking which
does no longer operate on the IP basis. Unlike the batman daemon,
which exchanges information using UDP packets and sets routing
tables, batman-advanced operates on ISO/OSI Layer 2 only and uses
and routes (or better: bridges) Ethernet Frames. It emulates a
virtual network switch of all nodes participating. Therefore all
nodes appear to be link local, thus all higher operating proto-
cols won't be affected by any changes within the network. You can
run almost any protocol above batman advanced, prominent examples
are: IPv4, IPv6, DHCP, IPX.
Batman advanced was implemented as a Linux kernel driver to re-
duce the overhead to a minimum. It does not depend on any (other)
network driver, and can be used on wifi as well as ethernet lan,
vpn, etc ... (anything with ethernet-style layer 2).
CONFIGURATION
-------------
Load the batman-adv module into your kernel:
# insmod batman-adv.ko
The module is now waiting for activation. You must add some in-
terfaces on which batman can operate. After loading the module
batman advanced will scan your systems interfaces to search for
compatible interfaces. Once found, it will create subfolders in
the /sys directories of each supported interface, e.g.
# ls /sys/class/net/eth0/batman_adv/
# iface_status mesh_iface
If an interface does not have the "batman_adv" subfolder it prob-
ably is not supported. Not supported interfaces are: loopback,
non-ethernet and batman's own interfaces.
Note: After the module was loaded it will continuously watch for
new interfaces to verify the compatibility. There is no need to
reload the module if you plug your USB wifi adapter into your ma-
chine after batman advanced was initially loaded.
To activate a given interface simply write "bat0" into its
"mesh_iface" file inside the batman_adv subfolder:
# echo bat0 > /sys/class/net/eth0/batman_adv/mesh_iface
Repeat this step for all interfaces you wish to add. Now batman
starts using/broadcasting on this/these interface(s).
By reading the "iface_status" file you can check its status:
# cat /sys/class/net/eth0/batman_adv/iface_status
# active
To deactivate an interface you have to write "none" into its
"mesh_iface" file:
# echo none > /sys/class/net/eth0/batman_adv/mesh_iface
All mesh wide settings can be found in batman's own interface
folder:
# ls /sys/class/net/bat0/mesh/
#aggregated_ogms distributed_arp_table gw_sel_class orig_interval
#ap_isolation fragmentation hop_penalty routing_algo
#bonding gw_bandwidth isolation_mark vlan0
#bridge_loop_avoidance gw_mode log_level
There is a special folder for debugging information:
# ls /sys/kernel/debug/batman_adv/bat0/
# bla_backbone_table log transtable_global
# bla_claim_table originators transtable_local
# gateways socket
Some of the files contain all sort of status information regard-
ing the mesh network. For example, you can view the table of
originators (mesh participants) with:
# cat /sys/kernel/debug/batman_adv/bat0/originators
Other files allow to change batman's behaviour to better fit your
requirements. For instance, you can check the current originator
interval (value in milliseconds which determines how often batman
sends its broadcast packets):
# cat /sys/class/net/bat0/mesh/orig_interval
# 1000
and also change its value:
# echo 3000 > /sys/class/net/bat0/mesh/orig_interval
In very mobile scenarios, you might want to adjust the originator
interval to a lower value. This will make the mesh more respon-
sive to topology changes, but will also increase the overhead.
USAGE
-----
To make use of your newly created mesh, batman advanced provides
a new interface "bat0" which you should use from this point on.
All interfaces added to batman advanced are not relevant any
longer because batman handles them for you. Basically, one "hands
over" the data by using the batman interface and batman will make
sure it reaches its destination.
The "bat0" interface can be used like any other regular inter-
face. It needs an IP address which can be either statically con-
figured or dynamically (by using DHCP or similar services):
# NodeA: ifconfig bat0 192.168.0.1
# NodeB: ifconfig bat0 192.168.0.2
# NodeB: ping 192.168.0.1
Note: In order to avoid problems remove all IP addresses previ-
ously assigned to interfaces now used by batman advanced, e.g.
# ifconfig eth0 0.0.0.0
LOGGING/DEBUGGING
-----------------
All error messages, warnings and information messages are sent to
the kernel log. Depending on your operating system distribution
this can be read in one of a number of ways. Try using the com-
mands: dmesg, logread, or looking in the files /var/log/kern.log
or /var/log/syslog. All batman-adv messages are prefixed with
"batman-adv:" So to see just these messages try
# dmesg | grep batman-adv
When investigating problems with your mesh network it is some-
times necessary to see more detail debug messages. This must be
enabled when compiling the batman-adv module. When building bat-
man-adv as part of kernel, use "make menuconfig" and enable the
option "B.A.T.M.A.N. debugging".
Those additional debug messages can be accessed using a special
file in debugfs
# cat /sys/kernel/debug/batman_adv/bat0/log
The additional debug output is by default disabled. It can be en-
abled during run time. Following log_levels are defined:
0 - All debug output disabled
1 - Enable messages related to routing / flooding / broadcasting
2 - Enable messages related to route added / changed / deleted
4 - Enable messages related to translation table operations
8 - Enable messages related to bridge loop avoidance
16 - Enable messaged related to DAT, ARP snooping and parsing
31 - Enable all messages
The debug output can be changed at runtime using the file
/sys/class/net/bat0/mesh/log_level. e.g.
# echo 6 > /sys/class/net/bat0/mesh/log_level
will enable debug messages for when routes change.
Counters for different types of packets entering and leaving the
batman-adv module are available through ethtool:
# ethtool --statistics bat0
BATCTL
------
As batman advanced operates on layer 2 all hosts participating in
the virtual switch are completely transparent for all protocols
above layer 2. Therefore the common diagnosis tools do not work
as expected. To overcome these problems batctl was created. At
the moment the batctl contains ping, traceroute, tcpdump and
interfaces to the kernel module settings.
For more information, please see the manpage (man batctl).
batctl is available on http://www.open-mesh.org/
CONTACT
-------
Please send us comments, experiences, questions, anything :)
IRC: #batman on irc.freenode.org
Mailing-list: b.a.t.m.a.n@open-mesh.org (optional subscription
at https://lists.open-mesh.org/mm/listinfo/b.a.t.m.a.n)
You can also contact the Authors:
Marek Lindner <mareklindner@neomailbox.ch>
Simon Wunderlich <sw@simonwunderlich.de>

View file

@ -0,0 +1,158 @@
LINUX DRIVERS FOR BAYCOM MODEMS
Thomas M. Sailer, HB9JNX/AE4WA, <sailer@ife.ee.ethz.ch>
!!NEW!! (04/98) The drivers for the baycom modems have been split into
separate drivers as they did not share any code, and the driver
and device names have changed.
This document describes the Linux Kernel Drivers for simple Baycom style
amateur radio modems.
The following drivers are available:
baycom_ser_fdx:
This driver supports the SER12 modems either full or half duplex.
Its baud rate may be changed via the `baud' module parameter,
therefore it supports just about every bit bang modem on a
serial port. Its devices are called bcsf0 through bcsf3.
This is the recommended driver for SER12 type modems,
however if you have a broken UART clone that does not have working
delta status bits, you may try baycom_ser_hdx.
baycom_ser_hdx:
This is an alternative driver for SER12 type modems.
It only supports half duplex, and only 1200 baud. Its devices
are called bcsh0 through bcsh3. Use this driver only if baycom_ser_fdx
does not work with your UART.
baycom_par:
This driver supports the par96 and picpar modems.
Its devices are called bcp0 through bcp3.
baycom_epp:
This driver supports the EPP modem.
Its devices are called bce0 through bce3.
This driver is work-in-progress.
The following modems are supported:
ser12: This is a very simple 1200 baud AFSK modem. The modem consists only
of a modulator/demodulator chip, usually a TI TCM3105. The computer
is responsible for regenerating the receiver bit clock, as well as
for handling the HDLC protocol. The modem connects to a serial port,
hence the name. Since the serial port is not used as an async serial
port, the kernel driver for serial ports cannot be used, and this
driver only supports standard serial hardware (8250, 16450, 16550)
par96: This is a modem for 9600 baud FSK compatible to the G3RUH standard.
The modem does all the filtering and regenerates the receiver clock.
Data is transferred from and to the PC via a shift register.
The shift register is filled with 16 bits and an interrupt is signalled.
The PC then empties the shift register in a burst. This modem connects
to the parallel port, hence the name. The modem leaves the
implementation of the HDLC protocol and the scrambler polynomial to
the PC.
picpar: This is a redesign of the par96 modem by Henning Rech, DF9IC. The modem
is protocol compatible to par96, but uses only three low power ICs
and can therefore be fed from the parallel port and does not require
an additional power supply. Furthermore, it incorporates a carrier
detect circuitry.
EPP: This is a high-speed modem adaptor that connects to an enhanced parallel port.
Its target audience is users working over a high speed hub (76.8kbit/s).
eppfpga: This is a redesign of the EPP adaptor.
All of the above modems only support half duplex communications. However,
the driver supports the KISS (see below) fullduplex command. It then simply
starts to send as soon as there's a packet to transmit and does not care
about DCD, i.e. it starts to send even if there's someone else on the channel.
This command is required by some implementations of the DAMA channel
access protocol.
The Interface of the drivers
Unlike previous drivers, these drivers are no longer character devices,
but they are now true kernel network interfaces. Installation is therefore
simple. Once installed, four interfaces named bc{sf,sh,p,e}[0-3] are available.
sethdlc from the ax25 utilities may be used to set driver states etc.
Users of userland AX.25 stacks may use the net2kiss utility (also available
in the ax25 utilities package) to convert packets of a network interface
to a KISS stream on a pseudo tty. There's also a patch available from
me for WAMPES which allows attaching a kernel network interface directly.
Configuring the driver
Every time a driver is inserted into the kernel, it has to know which
modems it should access at which ports. This can be done with the setbaycom
utility. If you are only using one modem, you can also configure the
driver from the insmod command line (or by means of an option line in
/etc/modprobe.d/*.conf).
Examples:
modprobe baycom_ser_fdx mode="ser12*" iobase=0x3f8 irq=4
sethdlc -i bcsf0 -p mode "ser12*" io 0x3f8 irq 4
Both lines configure the first port to drive a ser12 modem at the first
serial port (COM1 under DOS). The * in the mode parameter instructs the driver to use
the software DCD algorithm (see below).
insmod baycom_par mode="picpar" iobase=0x378
sethdlc -i bcp0 -p mode "picpar" io 0x378
Both lines configure the first port to drive a picpar modem at the
first parallel port (LPT1 under DOS). (Note: picpar implies
hardware DCD, par96 implies software DCD).
The channel access parameters can be set with sethdlc -a or kissparms.
Note that both utilities interpret the values slightly differently.
Hardware DCD versus Software DCD
To avoid collisions on the air, the driver must know when the channel is
busy. This is the task of the DCD circuitry/software. The driver may either
utilise a software DCD algorithm (options=1) or use a DCD signal from
the hardware (options=0).
ser12: if software DCD is utilised, the radio's squelch should always be
open. It is highly recommended to use the software DCD algorithm,
as it is much faster than most hardware squelch circuitry. The
disadvantage is a slightly higher load on the system.
par96: the software DCD algorithm for this type of modem is rather poor.
The modem simply does not provide enough information to implement
a reasonable DCD algorithm in software. Therefore, if your radio
feeds the DCD input of the PAR96 modem, the use of the hardware
DCD circuitry is recommended.
picpar: the picpar modem features a builtin DCD hardware, which is highly
recommended.
Compatibility with the rest of the Linux kernel
The serial driver and the baycom serial drivers compete
for the same hardware resources. Of course only one driver can access a given
interface at a time. The serial driver grabs all interfaces it can find at
startup time. Therefore the baycom drivers subsequently won't be able to
access a serial port. You might therefore find it necessary to release
a port owned by the serial driver with 'setserial /dev/ttyS# uart none', where
# is the number of the interface. The baycom drivers do not reserve any
ports at startup, unless one is specified on the 'insmod' command line. Another
method to solve the problem is to compile all drivers as modules and
leave it to kmod to load the correct driver depending on the application.
The parallel port drivers (baycom_par, baycom_epp) now use the parport subsystem
to arbitrate the ports between different client drivers.
vy 73s de
Tom Sailer, sailer@ife.ee.ethz.ch
hb9jnx @ hb9w.ampr.org

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,15 @@
In order to use the Ethernet bridging functionality, you'll need the
userspace tools.
Documentation for Linux bridging is on:
http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge
The bridge-utilities are maintained at:
git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/bridge-utils.git
Additionally, the iproute2 utilities can be used to configure
bridge devices.
If you still have questions, don't hesitate to post to the mailing list
(more info https://lists.linux-foundation.org/mailman/listinfo/bridge).

View file

@ -0,0 +1,175 @@
Linux CAIF
===========
copyright (C) ST-Ericsson AB 2010
Author: Sjur Brendeland/ sjur.brandeland@stericsson.com
License terms: GNU General Public License (GPL) version 2
Introduction
------------
CAIF is a MUX protocol used by ST-Ericsson cellular modems for
communication between Modem and host. The host processes can open virtual AT
channels, initiate GPRS Data connections, Video channels and Utility Channels.
The Utility Channels are general purpose pipes between modem and host.
ST-Ericsson modems support a number of transports between modem
and host. Currently, UART and Loopback are available for Linux.
Architecture:
------------
The implementation of CAIF is divided into:
* CAIF Socket Layer and GPRS IP Interface.
* CAIF Core Protocol Implementation
* CAIF Link Layer, implemented as NET devices.
RTNL
!
! +------+ +------+
! +------+! +------+!
! ! IP !! !Socket!!
+-------> !interf!+ ! API !+ <- CAIF Client APIs
! +------+ +------!
! ! !
! +-----------+
! !
! +------+ <- CAIF Core Protocol
! ! CAIF !
! ! Core !
! +------+
! +----------!---------+
! ! ! !
! +------+ +-----+ +------+
+--> ! HSI ! ! TTY ! ! USB ! <- Link Layer (Net Devices)
+------+ +-----+ +------+
I M P L E M E N T A T I O N
===========================
CAIF Core Protocol Layer
=========================================
CAIF Core layer implements the CAIF protocol as defined by ST-Ericsson.
It implements the CAIF protocol stack in a layered approach, where
each layer described in the specification is implemented as a separate layer.
The architecture is inspired by the design patterns "Protocol Layer" and
"Protocol Packet".
== CAIF structure ==
The Core CAIF implementation contains:
- Simple implementation of CAIF.
- Layered architecture (a la Streams), each layer in the CAIF
specification is implemented in a separate c-file.
- Clients must call configuration function to add PHY layer.
- Clients must implement CAIF layer to consume/produce
CAIF payload with receive and transmit functions.
- Clients must call configuration function to add and connect the
Client layer.
- When receiving / transmitting CAIF Packets (cfpkt), ownership is passed
to the called function (except for framing layers' receive function)
Layered Architecture
--------------------
The CAIF protocol can be divided into two parts: Support functions and Protocol
Implementation. The support functions include:
- CFPKT CAIF Packet. Implementation of CAIF Protocol Packet. The
CAIF Packet has functions for creating, destroying and adding content
and for adding/extracting header and trailers to protocol packets.
The CAIF Protocol implementation contains:
- CFCNFG CAIF Configuration layer. Configures the CAIF Protocol
Stack and provides a Client interface for adding Link-Layer and
Driver interfaces on top of the CAIF Stack.
- CFCTRL CAIF Control layer. Encodes and Decodes control messages
such as enumeration and channel setup. Also matches request and
response messages.
- CFSERVL General CAIF Service Layer functionality; handles flow
control and remote shutdown requests.
- CFVEI CAIF VEI layer. Handles CAIF AT Channels on VEI (Virtual
External Interface). This layer encodes/decodes VEI frames.
- CFDGML CAIF Datagram layer. Handles CAIF Datagram layer (IP
traffic), encodes/decodes Datagram frames.
- CFMUX CAIF Mux layer. Handles multiplexing between multiple
physical bearers and multiple channels such as VEI, Datagram, etc.
The MUX keeps track of the existing CAIF Channels and
Physical Instances and selects the appropriate instance based
on Channel-Id and Physical-ID.
- CFFRML CAIF Framing layer. Handles Framing i.e. Frame length
and frame checksum.
- CFSERL CAIF Serial layer. Handles concatenation/split of frames
into CAIF Frames with correct length.
+---------+
| Config |
| CFCNFG |
+---------+
!
+---------+ +---------+ +---------+
| AT | | Control | | Datagram|
| CFVEIL | | CFCTRL | | CFDGML |
+---------+ +---------+ +---------+
\_____________!______________/
!
+---------+
| MUX |
| |
+---------+
_____!_____
/ \
+---------+ +---------+
| CFFRML | | CFFRML |
| Framing | | Framing |
+---------+ +---------+
! !
+---------+ +---------+
| | | Serial |
| | | CFSERL |
+---------+ +---------+
In this layered approach the following "rules" apply.
- All layers embed the same structure "struct cflayer"
- A layer does not depend on any other layer's private data.
- Layers are stacked by setting the pointers
layer->up , layer->dn
- In order to send data upwards, each layer should do
layer->up->receive(layer->up, packet);
- In order to send data downwards, each layer should do
layer->dn->transmit(layer->dn, packet);
CAIF Socket and IP interface
===========================
The IP interface and CAIF socket API are implemented on top of the
CAIF Core protocol. The IP Interface and CAIF socket have an instance of
'struct cflayer', just like the CAIF Core protocol stack.
Net device and Socket implement the 'receive()' function defined by
'struct cflayer', just like the rest of the CAIF stack. In this way, transmit and
receive of packets is handled as by the rest of the layers: the 'dn->transmit()'
function is called in order to transmit data.
Configuration of Link Layer
---------------------------
The Link Layer is implemented as Linux network devices (struct net_device).
Payload handling and registration is done using standard Linux mechanisms.
The CAIF Protocol relies on a loss-less link layer without implementing
retransmission. This implies that packet drops must not happen.
Therefore a flow-control mechanism is implemented where the physical
interface can initiate flow stop for all CAIF Channels.

View file

@ -0,0 +1,109 @@
Copyright (C) ST-Ericsson AB 2010
Author: Sjur Brendeland/ sjur.brandeland@stericsson.com
License terms: GNU General Public License (GPL) version 2
---------------------------------------------------------
=== Start ===
If you have compiled CAIF for modules do:
$modprobe crc_ccitt
$modprobe caif
$modprobe caif_socket
$modprobe chnl_net
=== Preparing the setup with a STE modem ===
If you are working on integration of CAIF you should make sure
that the kernel is built with module support.
There are some things that need to be tweaked to get the host TTY correctly
set up to talk to the modem.
Since the CAIF stack is running in the kernel and we want to use the existing
TTY, we are installing our physical serial driver as a line discipline above
the TTY device.
To achieve this we need to install the N_CAIF ldisc from user space.
The benefit is that we can hook up to any TTY.
The use of Start-of-frame-extension (STX) must also be set as
module parameter "ser_use_stx".
Normally Frame Checksum is always used on UART, but this is also provided as a
module parameter "ser_use_fcs".
$ modprobe caif_serial ser_ttyname=/dev/ttyS0 ser_use_stx=yes
$ ifconfig caif_ttyS0 up
PLEASE NOTE: There is a limitation in Android shell.
It only accepts one argument to insmod/modprobe!
=== Trouble shooting ===
There are debugfs parameters provided for serial communication.
/sys/kernel/debug/caif_serial/<tty-name>/
* ser_state: Prints the bit-mask status where
- 0x02 means SENDING, this is a transient state.
- 0x10 means FLOW_OFF_SENT, i.e. the previous frame has not been sent
and is blocking further send operation. Flow OFF has been propagated
to all CAIF Channels using this TTY.
* tty_status: Prints the bit-mask tty status information
- 0x01 - tty->warned is on.
- 0x02 - tty->low_latency is on.
- 0x04 - tty->packed is on.
- 0x08 - tty->flow_stopped is on.
- 0x10 - tty->hw_stopped is on.
- 0x20 - tty->stopped is on.
* last_tx_msg: Binary blob Prints the last transmitted frame.
This can be printed with
$od --format=x1 /sys/kernel/debug/caif_serial/<tty>/last_rx_msg.
The first two tx messages sent look like this. Note: The initial
byte 02 is start of frame extension (STX) used for re-syncing
upon errors.
- Enumeration:
0000000 02 05 00 00 03 01 d2 02
| | | | | |
STX(1) | | | |
Length(2)| | |
Control Channel(1)
Command:Enumeration(1)
Link-ID(1)
Checksum(2)
- Channel Setup:
0000000 02 07 00 00 00 21 a1 00 48 df
| | | | | | | |
STX(1) | | | | | |
Length(2)| | | | |
Control Channel(1)
Command:Channel Setup(1)
Channel Type(1)
Priority and Link-ID(1)
Endpoint(1)
Checksum(2)
* last_rx_msg: Prints the last transmitted frame.
The RX messages for LinkSetup look almost identical but they have the
bit 0x20 set in the command bit, and Channel Setup has added one byte
before Checksum containing Channel ID.
NOTE: Several CAIF Messages might be concatenated. The maximum debug
buffer size is 128 bytes.
== Error Scenarios:
- last_tx_msg contains channel setup message and last_rx_msg is empty ->
The host seems to be able to send over the UART, at least the CAIF ldisc get
notified that sending is completed.
- last_tx_msg contains enumeration message and last_rx_msg is empty ->
The host is not able to send the message from UART, the tty has not been
able to complete the transmit operation.
- if /sys/kernel/debug/caif_serial/<tty>/tty_status is non-zero there
might be problems transmitting over UART.
E.g. host and modem wiring is not correct you will typically see
tty_status = 0x10 (hw_stopped) and ser_state = 0x10 (FLOW_OFF_SENT).
You will probably see the enumeration message in last_tx_message
and empty last_rx_message.

View file

@ -0,0 +1,208 @@
- CAIF SPI porting -
- CAIF SPI basics:
Running CAIF over SPI needs some extra setup, owing to the nature of SPI.
Two extra GPIOs have been added in order to negotiate the transfers
between the master and the slave. The minimum requirement for running
CAIF over SPI is a SPI slave chip and two GPIOs (more details below).
Please note that running as a slave implies that you need to keep up
with the master clock. An overrun or underrun event is fatal.
- CAIF SPI framework:
To make porting as easy as possible, the CAIF SPI has been divided in
two parts. The first part (called the interface part) deals with all
generic functionality such as length framing, SPI frame negotiation
and SPI frame delivery and transmission. The other part is the CAIF
SPI slave device part, which is the module that you have to write if
you want to run SPI CAIF on a new hardware. This part takes care of
the physical hardware, both with regard to SPI and to GPIOs.
- Implementing a CAIF SPI device:
- Functionality provided by the CAIF SPI slave device:
In order to implement a SPI device you will, as a minimum,
need to implement the following
functions:
int (*init_xfer) (struct cfspi_xfer * xfer, struct cfspi_dev *dev):
This function is called by the CAIF SPI interface to give
you a chance to set up your hardware to be ready to receive
a stream of data from the master. The xfer structure contains
both physical and logical addresses, as well as the total length
of the transfer in both directions.The dev parameter can be used
to map to different CAIF SPI slave devices.
void (*sig_xfer) (bool xfer, struct cfspi_dev *dev):
This function is called by the CAIF SPI interface when the output
(SPI_INT) GPIO needs to change state. The boolean value of the xfer
variable indicates whether the GPIO should be asserted (HIGH) or
deasserted (LOW). The dev parameter can be used to map to different CAIF
SPI slave devices.
- Functionality provided by the CAIF SPI interface:
void (*ss_cb) (bool assert, struct cfspi_ifc *ifc);
This function is called by the CAIF SPI slave device in order to
signal a change of state of the input GPIO (SS) to the interface.
Only active edges are mandatory to be reported.
This function can be called from IRQ context (recommended in order
not to introduce latency). The ifc parameter should be the pointer
returned from the platform probe function in the SPI device structure.
void (*xfer_done_cb) (struct cfspi_ifc *ifc);
This function is called by the CAIF SPI slave device in order to
report that a transfer is completed. This function should only be
called once both the transmission and the reception are completed.
This function can be called from IRQ context (recommended in order
not to introduce latency). The ifc parameter should be the pointer
returned from the platform probe function in the SPI device structure.
- Connecting the bits and pieces:
- Filling in the SPI slave device structure:
Connect the necessary callback functions.
Indicate clock speed (used to calculate toggle delays).
Chose a suitable name (helps debugging if you use several CAIF
SPI slave devices).
Assign your private data (can be used to map to your structure).
- Filling in the SPI slave platform device structure:
Add name of driver to connect to ("cfspi_sspi").
Assign the SPI slave device structure as platform data.
- Padding:
In order to optimize throughput, a number of SPI padding options are provided.
Padding can be enabled independently for uplink and downlink transfers.
Padding can be enabled for the head, the tail and for the total frame size.
The padding needs to be correctly configured on both sides of the link.
The padding can be changed via module parameters in cfspi_sspi.c or via
the sysfs directory of the cfspi_sspi driver (before device registration).
- CAIF SPI device template:
/*
* Copyright (C) ST-Ericsson AB 2010
* Author: Daniel Martensson / Daniel.Martensson@stericsson.com
* License terms: GNU General Public License (GPL), version 2.
*
*/
#include <linux/init.h>
#include <linux/module.h>
#include <linux/device.h>
#include <linux/wait.h>
#include <linux/interrupt.h>
#include <linux/dma-mapping.h>
#include <net/caif/caif_spi.h>
MODULE_LICENSE("GPL");
struct sspi_struct {
struct cfspi_dev sdev;
struct cfspi_xfer *xfer;
};
static struct sspi_struct slave;
static struct platform_device slave_device;
static irqreturn_t sspi_irq(int irq, void *arg)
{
/* You only need to trigger on an edge to the active state of the
* SS signal. Once a edge is detected, the ss_cb() function should be
* called with the parameter assert set to true. It is OK
* (and even advised) to call the ss_cb() function in IRQ context in
* order not to add any delay. */
return IRQ_HANDLED;
}
static void sspi_complete(void *context)
{
/* Normally the DMA or the SPI framework will call you back
* in something similar to this. The only thing you need to
* do is to call the xfer_done_cb() function, providing the pointer
* to the CAIF SPI interface. It is OK to call this function
* from IRQ context. */
}
static int sspi_init_xfer(struct cfspi_xfer *xfer, struct cfspi_dev *dev)
{
/* Store transfer info. For a normal implementation you should
* set up your DMA here and make sure that you are ready to
* receive the data from the master SPI. */
struct sspi_struct *sspi = (struct sspi_struct *)dev->priv;
sspi->xfer = xfer;
return 0;
}
void sspi_sig_xfer(bool xfer, struct cfspi_dev *dev)
{
/* If xfer is true then you should assert the SPI_INT to indicate to
* the master that you are ready to receive the data from the master
* SPI. If xfer is false then you should de-assert SPI_INT to indicate
* that the transfer is done.
*/
struct sspi_struct *sspi = (struct sspi_struct *)dev->priv;
}
static void sspi_release(struct device *dev)
{
/*
* Here you should release your SPI device resources.
*/
}
static int __init sspi_init(void)
{
/* Here you should initialize your SPI device by providing the
* necessary functions, clock speed, name and private data. Once
* done, you can register your device with the
* platform_device_register() function. This function will return
* with the CAIF SPI interface initialized. This is probably also
* the place where you should set up your GPIOs, interrupts and SPI
* resources. */
int res = 0;
/* Initialize slave device. */
slave.sdev.init_xfer = sspi_init_xfer;
slave.sdev.sig_xfer = sspi_sig_xfer;
slave.sdev.clk_mhz = 13;
slave.sdev.priv = &slave;
slave.sdev.name = "spi_sspi";
slave_device.dev.release = sspi_release;
/* Initialize platform device. */
slave_device.name = "cfspi_sspi";
slave_device.dev.platform_data = &slave.sdev;
/* Register platform device. */
res = platform_device_register(&slave_device);
if (res) {
printk(KERN_WARNING "sspi_init: failed to register dev.\n");
return -ENODEV;
}
return res;
}
static void __exit sspi_exit(void)
{
platform_device_del(&slave_device);
}
module_init(sspi_init);
module_exit(sspi_exit);

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,339 @@
cdc_mbim - Driver for CDC MBIM Mobile Broadband modems
========================================================
The cdc_mbim driver supports USB devices conforming to the "Universal
Serial Bus Communications Class Subclass Specification for Mobile
Broadband Interface Model" [1], which is a further development of
"Universal Serial Bus Communications Class Subclass Specifications for
Network Control Model Devices" [2] optimized for Mobile Broadband
devices, aka "3G/LTE modems".
Command Line Parameters
=======================
The cdc_mbim driver has no parameters of its own. But the probing
behaviour for NCM 1.0 backwards compatible MBIM functions (an
"NCM/MBIM function" as defined in section 3.2 of [1]) is affected
by a cdc_ncm driver parameter:
prefer_mbim
-----------
Type: Boolean
Valid Range: N/Y (0-1)
Default Value: Y (MBIM is preferred)
This parameter sets the system policy for NCM/MBIM functions. Such
functions will be handled by either the cdc_ncm driver or the cdc_mbim
driver depending on the prefer_mbim setting. Setting prefer_mbim=N
makes the cdc_mbim driver ignore these functions and lets the cdc_ncm
driver handle them instead.
The parameter is writable, and can be changed at any time. A manual
unbind/bind is required to make the change effective for NCM/MBIM
functions bound to the "wrong" driver
Basic usage
===========
MBIM functions are inactive when unmanaged. The cdc_mbim driver only
provides an userspace interface to the MBIM control channel, and will
not participate in the management of the function. This implies that a
userspace MBIM management application always is required to enable a
MBIM function.
Such userspace applications includes, but are not limited to:
- mbimcli (included with the libmbim [3] library), and
- ModemManager [4]
Establishing a MBIM IP session reequires at least these actions by the
management application:
- open the control channel
- configure network connection settings
- connect to network
- configure IP interface
Management application development
----------------------------------
The driver <-> userspace interfaces are described below. The MBIM
control channel protocol is described in [1].
MBIM control channel userspace ABI
==================================
/dev/cdc-wdmX character device
------------------------------
The driver creates a two-way pipe to the MBIM function control channel
using the cdc-wdm driver as a subdriver. The userspace end of the
control channel pipe is a /dev/cdc-wdmX character device.
The cdc_mbim driver does not process or police messages on the control
channel. The channel is fully delegated to the userspace management
application. It is therefore up to this application to ensure that it
complies with all the control channel requirements in [1].
The cdc-wdmX device is created as a child of the MBIM control
interface USB device. The character device associated with a specific
MBIM function can be looked up using sysfs. For example:
bjorn@nemi:~$ ls /sys/bus/usb/drivers/cdc_mbim/2-4:2.12/usbmisc
cdc-wdm0
bjorn@nemi:~$ grep . /sys/bus/usb/drivers/cdc_mbim/2-4:2.12/usbmisc/cdc-wdm0/dev
180:0
USB configuration descriptors
-----------------------------
The wMaxControlMessage field of the CDC MBIM functional descriptor
limits the maximum control message size. The managament application is
responsible for negotiating a control message size complying with the
requirements in section 9.3.1 of [1], taking this descriptor field
into consideration.
The userspace application can access the CDC MBIM functional
descriptor of a MBIM function using either of the two USB
configuration descriptor kernel interfaces described in [6] or [7].
See also the ioctl documentation below.
Fragmentation
-------------
The userspace application is responsible for all control message
fragmentation and defragmentaion, as described in section 9.5 of [1].
/dev/cdc-wdmX write()
---------------------
The MBIM control messages from the management application *must not*
exceed the negotiated control message size.
/dev/cdc-wdmX read()
--------------------
The management application *must* accept control messages of up the
negotiated control message size.
/dev/cdc-wdmX ioctl()
--------------------
IOCTL_WDM_MAX_COMMAND: Get Maximum Command Size
This ioctl returns the wMaxControlMessage field of the CDC MBIM
functional descriptor for MBIM devices. This is intended as a
convenience, eliminating the need to parse the USB descriptors from
userspace.
#include <stdio.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/types.h>
#include <linux/usb/cdc-wdm.h>
int main()
{
__u16 max;
int fd = open("/dev/cdc-wdm0", O_RDWR);
if (!ioctl(fd, IOCTL_WDM_MAX_COMMAND, &max))
printf("wMaxControlMessage is %d\n", max);
}
Custom device services
----------------------
The MBIM specification allows vendors to freely define additional
services. This is fully supported by the cdc_mbim driver.
Support for new MBIM services, including vendor specified services, is
implemented entirely in userspace, like the rest of the MBIM control
protocol
New services should be registered in the MBIM Registry [5].
MBIM data channel userspace ABI
===============================
wwanY network device
--------------------
The cdc_mbim driver represents the MBIM data channel as a single
network device of the "wwan" type. This network device is initially
mapped to MBIM IP session 0.
Multiplexed IP sessions (IPS)
-----------------------------
MBIM allows multiplexing up to 256 IP sessions over a single USB data
channel. The cdc_mbim driver models such IP sessions as 802.1q VLAN
subdevices of the master wwanY device, mapping MBIM IP session Z to
VLAN ID Z for all values of Z greater than 0.
The device maximum Z is given in the MBIM_DEVICE_CAPS_INFO structure
described in section 10.5.1 of [1].
The userspace management application is responsible for adding new
VLAN links prior to establishing MBIM IP sessions where the SessionId
is greater than 0. These links can be added by using the normal VLAN
kernel interfaces, either ioctl or netlink.
For example, adding a link for a MBIM IP session with SessionId 3:
ip link add link wwan0 name wwan0.3 type vlan id 3
The driver will automatically map the "wwan0.3" network device to MBIM
IP session 3.
Device Service Streams (DSS)
----------------------------
MBIM also allows up to 256 non-IP data streams to be multiplexed over
the same shared USB data channel. The cdc_mbim driver models these
sessions as another set of 802.1q VLAN subdevices of the master wwanY
device, mapping MBIM DSS session A to VLAN ID (256 + A) for all values
of A.
The device maximum A is given in the MBIM_DEVICE_SERVICES_INFO
structure described in section 10.5.29 of [1].
The DSS VLAN subdevices are used as a practical interface between the
shared MBIM data channel and a MBIM DSS aware userspace application.
It is not intended to be presented as-is to an end user. The
assumption is that an userspace application initiating a DSS session
also takes care of the necessary framing of the DSS data, presenting
the stream to the end user in an appropriate way for the stream type.
The network device ABI requires a dummy ethernet header for every DSS
data frame being transported. The contents of this header is
arbitrary, with the following exceptions:
- TX frames using an IP protocol (0x0800 or 0x86dd) will be dropped
- RX frames will have the protocol field set to ETH_P_802_3 (but will
not be properly formatted 802.3 frames)
- RX frames will have the destination address set to the hardware
address of the master device
The DSS supporting userspace management application is responsible for
adding the dummy ethernet header on TX and stripping it on RX.
This is a simple example using tools commonly available, exporting
DssSessionId 5 as a pty character device pointed to by a /dev/nmea
symlink:
ip link add link wwan0 name wwan0.dss5 type vlan id 261
ip link set dev wwan0.dss5 up
socat INTERFACE:wwan0.dss5,type=2 PTY:,echo=0,link=/dev/nmea
This is only an example, most suitable for testing out a DSS
service. Userspace applications supporting specific MBIM DSS services
are expected to use the tools and programming interfaces required by
that service.
Note that adding VLAN links for DSS sessions is entirely optional. A
management application may instead choose to bind a packet socket
directly to the master network device, using the received VLAN tags to
map frames to the correct DSS session and adding 18 byte VLAN ethernet
headers with the appropriate tag on TX. In this case using a socket
filter is recommended, matching only the DSS VLAN subset. This avoid
unnecessary copying of unrelated IP session data to userspace. For
example:
static struct sock_filter dssfilter[] = {
/* use special negative offsets to get VLAN tag */
BPF_STMT(BPF_LD|BPF_B|BPF_ABS, SKF_AD_OFF + SKF_AD_VLAN_TAG_PRESENT),
BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, 1, 0, 6), /* true */
/* verify DSS VLAN range */
BPF_STMT(BPF_LD|BPF_H|BPF_ABS, SKF_AD_OFF + SKF_AD_VLAN_TAG),
BPF_JUMP(BPF_JMP|BPF_JGE|BPF_K, 256, 0, 4), /* 256 is first DSS VLAN */
BPF_JUMP(BPF_JMP|BPF_JGE|BPF_K, 512, 3, 0), /* 511 is last DSS VLAN */
/* verify ethertype */
BPF_STMT(BPF_LD|BPF_H|BPF_ABS, 2 * ETH_ALEN),
BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, ETH_P_802_3, 0, 1),
BPF_STMT(BPF_RET|BPF_K, (u_int)-1), /* accept */
BPF_STMT(BPF_RET|BPF_K, 0), /* ignore */
};
Tagged IP session 0 VLAN
------------------------
As described above, MBIM IP session 0 is treated as special by the
driver. It is initially mapped to untagged frames on the wwanY
network device.
This mapping implies a few restrictions on multiplexed IPS and DSS
sessions, which may not always be practical:
- no IPS or DSS session can use a frame size greater than the MTU on
IP session 0
- no IPS or DSS session can be in the up state unless the network
device representing IP session 0 also is up
These problems can be avoided by optionally making the driver map IP
session 0 to a VLAN subdevice, similar to all other IP sessions. This
behaviour is triggered by adding a VLAN link for the magic VLAN ID
4094. The driver will then immediately start mapping MBIM IP session
0 to this VLAN, and will drop untagged frames on the master wwanY
device.
Tip: It might be less confusing to the end user to name this VLAN
subdevice after the MBIM SessionID instead of the VLAN ID. For
example:
ip link add link wwan0 name wwan0.0 type vlan id 4094
VLAN mapping
------------
Summarizing the cdc_mbim driver mapping described above, we have this
relationship between VLAN tags on the wwanY network device and MBIM
sessions on the shared USB data channel:
VLAN ID MBIM type MBIM SessionID Notes
---------------------------------------------------------
untagged IPS 0 a)
1 - 255 IPS 1 - 255 <VLANID>
256 - 511 DSS 0 - 255 <VLANID - 256>
512 - 4093 b)
4094 IPS 0 c)
a) if no VLAN ID 4094 link exists, else dropped
b) unsupported VLAN range, unconditionally dropped
c) if a VLAN ID 4094 link exists, else dropped
References
==========
[1] USB Implementers Forum, Inc. - "Universal Serial Bus
Communications Class Subclass Specification for Mobile Broadband
Interface Model", Revision 1.0 (Errata 1), May 1, 2013
- http://www.usb.org/developers/docs/devclass_docs/
[2] USB Implementers Forum, Inc. - "Universal Serial Bus
Communications Class Subclass Specifications for Network Control
Model Devices", Revision 1.0 (Errata 1), November 24, 2010
- http://www.usb.org/developers/docs/devclass_docs/
[3] libmbim - "a glib-based library for talking to WWAN modems and
devices which speak the Mobile Interface Broadband Model (MBIM)
protocol"
- http://www.freedesktop.org/wiki/Software/libmbim/
[4] ModemManager - "a DBus-activated daemon which controls mobile
broadband (2G/3G/4G) devices and connections"
- http://www.freedesktop.org/wiki/Software/ModemManager/
[5] "MBIM (Mobile Broadband Interface Model) Registry"
- http://compliance.usb.org/mbim/
[6] "/proc/bus/usb filesystem output"
- Documentation/usb/proc_usb_info.txt
[7] "/sys/bus/usb/devices/.../descriptors"
- Documentation/ABI/stable/sysfs-bus-usb

View file

@ -0,0 +1,63 @@
Text File for the COPS LocalTalk Linux driver (cops.c).
By Jay Schulist <jschlst@samba.org>
This driver has two modes and they are: Dayna mode and Tangent mode.
Each mode corresponds with the type of card. It has been found
that there are 2 main types of cards and all other cards are
the same and just have different names or only have minor differences
such as more IO ports. As this driver is tested it will
become more clear exactly what cards are supported.
Right now these cards are known to work with the COPS driver. The
LT-200 cards work in a somewhat more limited capacity than the
DL200 cards, which work very well and are in use by many people.
TANGENT driver mode:
Tangent ATB-II, Novell NL-1000, Daystar Digital LT-200
DAYNA driver mode:
Dayna DL2000/DaynaTalk PC (Half Length), COPS LT-95,
Farallon PhoneNET PC III, Farallon PhoneNET PC II
Other cards possibly supported mode unknown though:
Dayna DL2000 (Full length)
The COPS driver defaults to using Dayna mode. To change the driver's
mode if you built a driver with dual support use board_type=1 or
board_type=2 for Dayna or Tangent with insmod.
** Operation/loading of the driver.
Use modprobe like this: /sbin/modprobe cops.o (IO #) (IRQ #)
If you do not specify any options the driver will try and use the IO = 0x240,
IRQ = 5. As of right now I would only use IRQ 5 for the card, if autoprobing.
To load multiple COPS driver Localtalk cards you can do one of the following.
insmod cops io=0x240 irq=5
insmod -o cops2 cops io=0x260 irq=3
Or in lilo.conf put something like this:
append="ether=5,0x240,lt0 ether=3,0x260,lt1"
Then bring up the interface with ifconfig. It will look something like this:
lt0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-F7-00-00-00-00-00-00-00-00
inet addr:192.168.1.2 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING NOARP MULTICAST MTU:600 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 coll:0
** Netatalk Configuration
You will need to configure atalkd with something like the following to make
it work with the cops.c driver.
* For single LTalk card use.
dummy -seed -phase 2 -net 2000 -addr 2000.10 -zone "1033"
lt0 -seed -phase 1 -net 1000 -addr 1000.50 -zone "1033"
* For multiple cards, Ethernet and LocalTalk.
eth0 -seed -phase 2 -net 3000 -addr 3000.20 -zone "1033"
lt0 -seed -phase 1 -net 1000 -addr 1000.50 -zone "1033"
* For multiple LocalTalk cards, and an Ethernet card.
* Order seems to matter here, Ethernet last.
lt0 -seed -phase 1 -net 1000 -addr 1000.10 -zone "LocalTalk1"
lt1 -seed -phase 1 -net 2000 -addr 2000.20 -zone "LocalTalk2"
eth0 -seed -phase 2 -net 3000 -addr 3000.30 -zone "EtherTalk"

View file

@ -0,0 +1,624 @@
NOTE
----
This document was contributed by Cirrus Logic for kernel 2.2.5. This version
has been updated for 2.3.48 by Andrew Morton.
Cirrus make a copy of this driver available at their website, as
described below. In general, you should use the driver version which
comes with your Linux distribution.
CIRRUS LOGIC LAN CS8900/CS8920 ETHERNET ADAPTERS
Linux Network Interface Driver ver. 2.00 <kernel 2.3.48>
===============================================================================
TABLE OF CONTENTS
1.0 CIRRUS LOGIC LAN CS8900/CS8920 ETHERNET ADAPTERS
1.1 Product Overview
1.2 Driver Description
1.2.1 Driver Name
1.2.2 File in the Driver Package
1.3 System Requirements
1.4 Licensing Information
2.0 ADAPTER INSTALLATION and CONFIGURATION
2.1 CS8900-based Adapter Configuration
2.2 CS8920-based Adapter Configuration
3.0 LOADING THE DRIVER AS A MODULE
4.0 COMPILING THE DRIVER
4.1 Compiling the Driver as a Loadable Module
4.2 Compiling the driver to support memory mode
4.3 Compiling the driver to support Rx DMA
5.0 TESTING AND TROUBLESHOOTING
5.1 Known Defects and Limitations
5.2 Testing the Adapter
5.2.1 Diagnostic Self-Test
5.2.2 Diagnostic Network Test
5.3 Using the Adapter's LEDs
5.4 Resolving I/O Conflicts
6.0 TECHNICAL SUPPORT
6.1 Contacting Cirrus Logic's Technical Support
6.2 Information Required Before Contacting Technical Support
6.3 Obtaining the Latest Driver Version
6.4 Current maintainer
6.5 Kernel boot parameters
1.0 CIRRUS LOGIC LAN CS8900/CS8920 ETHERNET ADAPTERS
===============================================================================
1.1 PRODUCT OVERVIEW
The CS8900-based ISA Ethernet Adapters from Cirrus Logic follow
IEEE 802.3 standards and support half or full-duplex operation in ISA bus
computers on 10 Mbps Ethernet networks. The adapters are designed for operation
in 16-bit ISA or EISA bus expansion slots and are available in
10BaseT-only or 3-media configurations (10BaseT, 10Base2, and AUI for 10Base-5
or fiber networks).
CS8920-based adapters are similar to the CS8900-based adapter with additional
features for Plug and Play (PnP) support and Wakeup Frame recognition. As
such, the configuration procedures differ somewhat between the two types of
adapters. Refer to the "Adapter Configuration" section for details on
configuring both types of adapters.
1.2 DRIVER DESCRIPTION
The CS8900/CS8920 Ethernet Adapter driver for Linux supports the Linux
v2.3.48 or greater kernel. It can be compiled directly into the kernel
or loaded at run-time as a device driver module.
1.2.1 Driver Name: cs89x0
1.2.2 Files in the Driver Archive:
The files in the driver at Cirrus' website include:
readme.txt - this file
build - batch file to compile cs89x0.c.
cs89x0.c - driver C code
cs89x0.h - driver header file
cs89x0.o - pre-compiled module (for v2.2.5 kernel)
config/Config.in - sample file to include cs89x0 driver in the kernel.
config/Makefile - sample file to include cs89x0 driver in the kernel.
config/Space.c - sample file to include cs89x0 driver in the kernel.
1.3 SYSTEM REQUIREMENTS
The following hardware is required:
* Cirrus Logic LAN (CS8900/20-based) Ethernet ISA Adapter
* IBM or IBM-compatible PC with:
* An 80386 or higher processor
* 16 bytes of contiguous IO space available between 210h - 370h
* One available IRQ (5,10,11,or 12 for the CS8900, 3-7,9-15 for CS8920).
* Appropriate cable (and connector for AUI, 10BASE-2) for your network
topology.
The following software is required:
* LINUX kernel version 2.3.48 or higher
* CS8900/20 Setup Utility (DOS-based)
* LINUX kernel sources for your kernel (if compiling into kernel)
* GNU Toolkit (gcc and make) v2.6 or above (if compiling into kernel
or a module)
1.4 LICENSING INFORMATION
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, version 1.
This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
For a full copy of the GNU General Public License, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
2.0 ADAPTER INSTALLATION and CONFIGURATION
===============================================================================
Both the CS8900 and CS8920-based adapters can be configured using parameters
stored in an on-board EEPROM. You must use the DOS-based CS8900/20 Setup
Utility if you want to change the adapter's configuration in EEPROM.
When loading the driver as a module, you can specify many of the adapter's
configuration parameters on the command-line to override the EEPROM's settings
or for interface configuration when an EEPROM is not used. (CS8920-based
adapters must use an EEPROM.) See Section 3.0 LOADING THE DRIVER AS A MODULE.
Since the CS8900/20 Setup Utility is a DOS-based application, you must install
and configure the adapter in a DOS-based system using the CS8900/20 Setup
Utility before installation in the target LINUX system. (Not required if
installing a CS8900-based adapter and the default configuration is acceptable.)
2.1 CS8900-BASED ADAPTER CONFIGURATION
CS8900-based adapters shipped from Cirrus Logic have been configured
with the following "default" settings:
Operation Mode: Memory Mode
IRQ: 10
Base I/O Address: 300
Memory Base Address: D0000
Optimization: DOS Client
Transmission Mode: Half-duplex
BootProm: None
Media Type: Autodetect (3-media cards) or
10BASE-T (10BASE-T only adapter)
You should only change the default configuration settings if conflicts with
another adapter exists. To change the adapter's configuration, run the
CS8900/20 Setup Utility.
2.2 CS8920-BASED ADAPTER CONFIGURATION
CS8920-based adapters are shipped from Cirrus Logic configured as Plug
and Play (PnP) enabled. However, since the cs89x0 driver does NOT
support PnP, you must install the CS8920 adapter in a DOS-based PC and
run the CS8900/20 Setup Utility to disable PnP and configure the
adapter before installation in the target Linux system. Failure to do
this will leave the adapter inactive and the driver will be unable to
communicate with the adapter.
****************************************************************
* CS8920-BASED ADAPTERS: *
* *
* CS8920-BASED ADAPTERS ARE PLUG and PLAY ENABLED BY DEFAULT. *
* THE CS89X0 DRIVER DOES NOT SUPPORT PnP. THEREFORE, YOU MUST *
* RUN THE CS8900/20 SETUP UTILITY TO DISABLE PnP SUPPORT AND *
* TO ACTIVATE THE ADAPTER. *
****************************************************************
3.0 LOADING THE DRIVER AS A MODULE
===============================================================================
If the driver is compiled as a loadable module, you can load the driver module
with the 'modprobe' command. Many of the adapter's configuration parameters can
be specified as command-line arguments to the load command. This facility
provides a means to override the EEPROM's settings or for interface
configuration when an EEPROM is not used.
Example:
insmod cs89x0.o io=0x200 irq=0xA media=aui
This example loads the module and configures the adapter to use an IO port base
address of 200h, interrupt 10, and use the AUI media connection. The following
configuration options are available on the command line:
* io=### - specify IO address (200h-360h)
* irq=## - specify interrupt level
* use_dma=1 - Enable DMA
* dma=# - specify dma channel (Driver is compiled to support
Rx DMA only)
* dmasize=# (16 or 64) - DMA size 16K or 64K. Default value is set to 16.
* media=rj45 - specify media type
or media=bnc
or media=aui
or media=auto
* duplex=full - specify forced half/full/autonegotiate duplex
or duplex=half
or duplex=auto
* debug=# - debug level (only available if the driver was compiled
for debugging)
NOTES:
a) If an EEPROM is present, any specified command-line parameter
will override the corresponding configuration value stored in
EEPROM.
b) The "io" parameter must be specified on the command-line.
c) The driver's hardware probe routine is designed to avoid
writing to I/O space until it knows that there is a cs89x0
card at the written addresses. This could cause problems
with device probing. To avoid this behaviour, add one
to the `io=' module parameter. This doesn't actually change
the I/O address, but it is a flag to tell the driver
to partially initialise the hardware before trying to
identify the card. This could be dangerous if you are
not sure that there is a cs89x0 card at the provided address.
For example, to scan for an adapter located at IO base 0x300,
specify an IO address of 0x301.
d) The "duplex=auto" parameter is only supported for the CS8920.
e) The minimum command-line configuration required if an EEPROM is
not present is:
io
irq
media type (no autodetect)
f) The following additional parameters are CS89XX defaults (values
used with no EEPROM or command-line argument).
* DMA Burst = enabled
* IOCHRDY Enabled = enabled
* UseSA = enabled
* CS8900 defaults to half-duplex if not specified on command-line
* CS8920 defaults to autoneg if not specified on command-line
* Use reset defaults for other config parameters
* dma_mode = 0
g) You can use ifconfig to set the adapter's Ethernet address.
h) Many Linux distributions use the 'modprobe' command to load
modules. This program uses the '/etc/conf.modules' file to
determine configuration information which is passed to a driver
module when it is loaded. All the configuration options which are
described above may be placed within /etc/conf.modules.
For example:
> cat /etc/conf.modules
...
alias eth0 cs89x0
options cs89x0 io=0x0200 dma=5 use_dma=1
...
In this example we are telling the module system that the
ethernet driver for this machine should use the cs89x0 driver. We
are asking 'modprobe' to pass the 'io', 'dma' and 'use_dma'
arguments to the driver when it is loaded.
i) Cirrus recommend that the cs89x0 use the ISA DMA channels 5, 6 or
7. You will probably find that other DMA channels will not work.
j) The cs89x0 supports DMA for receiving only. DMA mode is
significantly more efficient. Flooding a 400 MHz Celeron machine
with large ping packets consumes 82% of its CPU capacity in non-DMA
mode. With DMA this is reduced to 45%.
k) If your Linux kernel was compiled with inbuilt plug-and-play
support you will be able to find information about the cs89x0 card
with the command
cat /proc/isapnp
l) If during DMA operation you find erratic behavior or network data
corruption you should use your PC's BIOS to slow the EISA bus clock.
m) If the cs89x0 driver is compiled directly into the kernel
(non-modular) then its I/O address is automatically determined by
ISA bus probing. The IRQ number, media options, etc are determined
from the card's EEPROM.
n) If the cs89x0 driver is compiled directly into the kernel, DMA
mode may be selected by providing the kernel with a boot option
'cs89x0_dma=N' where 'N' is the desired DMA channel number (5, 6 or 7).
Kernel boot options may be provided on the LILO command line:
LILO boot: linux cs89x0_dma=5
or they may be placed in /etc/lilo.conf:
image=/boot/bzImage-2.3.48
append="cs89x0_dma=5"
label=linux
root=/dev/hda5
read-only
The DMA Rx buffer size is hardwired to 16 kbytes in this mode.
(64k mode is not available).
4.0 COMPILING THE DRIVER
===============================================================================
The cs89x0 driver can be compiled directly into the kernel or compiled into
a loadable device driver module.
4.1 COMPILING THE DRIVER AS A LOADABLE MODULE
To compile the driver into a loadable module, use the following command
(single command line, without quotes):
"gcc -D__KERNEL__ -I/usr/src/linux/include -I/usr/src/linux/net/inet -Wall
-Wstrict-prototypes -O2 -fomit-frame-pointer -DMODULE -DCONFIG_MODVERSIONS
-c cs89x0.c"
4.2 COMPILING THE DRIVER TO SUPPORT MEMORY MODE
Support for memory mode was not carried over into the 2.3 series kernels.
4.3 COMPILING THE DRIVER TO SUPPORT Rx DMA
The compile-time optionality for DMA was removed in the 2.3 kernel
series. DMA support is now unconditionally part of the driver. It is
enabled by the 'use_dma=1' module option.
5.0 TESTING AND TROUBLESHOOTING
===============================================================================
5.1 KNOWN DEFECTS and LIMITATIONS
Refer to the RELEASE.TXT file distributed as part of this archive for a list of
known defects, driver limitations, and work arounds.
5.2 TESTING THE ADAPTER
Once the adapter has been installed and configured, the diagnostic option of
the CS8900/20 Setup Utility can be used to test the functionality of the
adapter and its network connection. Use the diagnostics 'Self Test' option to
test the functionality of the adapter with the hardware configuration you have
assigned. You can use the diagnostics 'Network Test' to test the ability of the
adapter to communicate across the Ethernet with another PC equipped with a
CS8900/20-based adapter card (it must also be running the CS8900/20 Setup
Utility).
NOTE: The Setup Utility's diagnostics are designed to run in a
DOS-only operating system environment. DO NOT run the diagnostics
from a DOS or command prompt session under Windows 95, Windows NT,
OS/2, or other operating system.
To run the diagnostics tests on the CS8900/20 adapter:
1.) Boot DOS on the PC and start the CS8900/20 Setup Utility.
2.) The adapter's current configuration is displayed. Hit the ENTER key to
get to the main menu.
4.) Select 'Diagnostics' (ALT-G) from the main menu.
* Select 'Self-Test' to test the adapter's basic functionality.
* Select 'Network Test' to test the network connection and cabling.
5.2.1 DIAGNOSTIC SELF-TEST
The diagnostic self-test checks the adapter's basic functionality as well as
its ability to communicate across the ISA bus based on the system resources
assigned during hardware configuration. The following tests are performed:
* IO Register Read/Write Test
The IO Register Read/Write test insures that the CS8900/20 can be
accessed in IO mode, and that the IO base address is correct.
* Shared Memory Test
The Shared Memory test insures the CS8900/20 can be accessed in memory
mode and that the range of memory addresses assigned does not conflict
with other devices in the system.
* Interrupt Test
The Interrupt test insures there are no conflicts with the assigned IRQ
signal.
* EEPROM Test
The EEPROM test insures the EEPROM can be read.
* Chip RAM Test
The Chip RAM test insures the 4K of memory internal to the CS8900/20 is
working properly.
* Internal Loop-back Test
The Internal Loop Back test insures the adapter's transmitter and
receiver are operating properly. If this test fails, make sure the
adapter's cable is connected to the network (check for LED activity for
example).
* Boot PROM Test
The Boot PROM test insures the Boot PROM is present, and can be read.
Failure indicates the Boot PROM was not successfully read due to a
hardware problem or due to a conflicts on the Boot PROM address
assignment. (Test only applies if the adapter is configured to use the
Boot PROM option.)
Failure of a test item indicates a possible system resource conflict with
another device on the ISA bus. In this case, you should use the Manual Setup
option to reconfigure the adapter by selecting a different value for the system
resource that failed.
5.2.2 DIAGNOSTIC NETWORK TEST
The Diagnostic Network Test verifies a working network connection by
transferring data between two CS8900/20 adapters installed in different PCs
on the same network. (Note: the diagnostic network test should not be run
between two nodes across a router.)
This test requires that each of the two PCs have a CS8900/20-based adapter
installed and have the CS8900/20 Setup Utility running. The first PC is
configured as a Responder and the other PC is configured as an Initiator.
Once the Initiator is started, it sends data frames to the Responder which
returns the frames to the Initiator.
The total number of frames received and transmitted are displayed on the
Initiator's display, along with a count of the number of frames received and
transmitted OK or in error. The test can be terminated anytime by the user at
either PC.
To setup the Diagnostic Network Test:
1.) Select a PC with a CS8900/20-based adapter and a known working network
connection to act as the Responder. Run the CS8900/20 Setup Utility
and select 'Diagnostics -> Network Test -> Responder' from the main
menu. Hit ENTER to start the Responder.
2.) Return to the PC with the CS8900/20-based adapter you want to test and
start the CS8900/20 Setup Utility.
3.) From the main menu, Select 'Diagnostic -> Network Test -> Initiator'.
Hit ENTER to start the test.
You may stop the test on the Initiator at any time while allowing the Responder
to continue running. In this manner, you can move to additional PCs and test
them by starting the Initiator on another PC without having to stop/start the
Responder.
5.3 USING THE ADAPTER'S LEDs
The 2 and 3-media adapters have two LEDs visible on the back end of the board
located near the 10Base-T connector.
Link Integrity LED: A "steady" ON of the green LED indicates a valid 10Base-T
connection. (Only applies to 10Base-T. The green LED has no significance for
a 10Base-2 or AUI connection.)
TX/RX LED: The yellow LED lights briefly each time the adapter transmits or
receives data. (The yellow LED will appear to "flicker" on a typical network.)
5.4 RESOLVING I/O CONFLICTS
An IO conflict occurs when two or more adapter use the same ISA resource (IO
address, memory address or IRQ). You can usually detect an IO conflict in one
of four ways after installing and or configuring the CS8900/20-based adapter:
1.) The system does not boot properly (or at all).
2.) The driver cannot communicate with the adapter, reporting an "Adapter
not found" error message.
3.) You cannot connect to the network or the driver will not load.
4.) If you have configured the adapter to run in memory mode but the driver
reports it is using IO mode when loading, this is an indication of a
memory address conflict.
If an IO conflict occurs, run the CS8900/20 Setup Utility and perform a
diagnostic self-test. Normally, the ISA resource in conflict will fail the
self-test. If so, reconfigure the adapter selecting another choice for the
resource in conflict. Run the diagnostics again to check for further IO
conflicts.
In some cases, such as when the PC will not boot, it may be necessary to remove
the adapter and reconfigure it by installing it in another PC to run the
CS8900/20 Setup Utility. Once reinstalled in the target system, run the
diagnostics self-test to ensure the new configuration is free of conflicts
before loading the driver again.
When manually configuring the adapter, keep in mind the typical ISA system
resource usage as indicated in the tables below.
I/O Address Device IRQ Device
----------- -------- --- --------
200-20F Game I/O adapter 3 COM2, Bus Mouse
230-23F Bus Mouse 4 COM1
270-27F LPT3: third parallel port 5 LPT2
2F0-2FF COM2: second serial port 6 Floppy Disk controller
320-32F Fixed disk controller 7 LPT1
8 Real-time Clock
9 EGA/VGA display adapter
12 Mouse (PS/2)
Memory Address Device 13 Math Coprocessor
-------------- --------------------- 14 Hard Disk controller
A000-BFFF EGA Graphics Adapter
A000-C7FF VGA Graphics Adapter
B000-BFFF Mono Graphics Adapter
B800-BFFF Color Graphics Adapter
E000-FFFF AT BIOS
6.0 TECHNICAL SUPPORT
===============================================================================
6.1 CONTACTING CIRRUS LOGIC'S TECHNICAL SUPPORT
Cirrus Logic's CS89XX Technical Application Support can be reached at:
Telephone :(800) 888-5016 (from inside U.S. and Canada)
:(512) 442-7555 (from outside the U.S. and Canada)
Fax :(512) 912-3871
Email :ethernet@crystal.cirrus.com
WWW :http://www.cirrus.com
6.2 INFORMATION REQUIRED BEFORE CONTACTING TECHNICAL SUPPORT
Before contacting Cirrus Logic for technical support, be prepared to provide as
Much of the following information as possible.
1.) Adapter type (CRD8900, CDB8900, CDB8920, etc.)
2.) Adapter configuration
* IO Base, Memory Base, IO or memory mode enabled, IRQ, DMA channel
* Plug and Play enabled/disabled (CS8920-based adapters only)
* Configured for media auto-detect or specific media type (which type).
3.) PC System's Configuration
* Plug and Play system (yes/no)
* BIOS (make and version)
* System make and model
* CPU (type and speed)
* System RAM
* SCSI Adapter
4.) Software
* CS89XX driver and version
* Your network operating system and version
* Your system's OS version
* Version of all protocol support files
5.) Any Error Message displayed.
6.3 OBTAINING THE LATEST DRIVER VERSION
You can obtain the latest CS89XX drivers and support software from Cirrus Logic's
Web site. You can also contact Cirrus Logic's Technical Support (email:
ethernet@crystal.cirrus.com) and request that you be registered for automatic
software-update notification.
Cirrus Logic maintains a web page at http://www.cirrus.com with the
latest drivers and technical publications.
6.4 Current maintainer
In February 2000 the maintenance of this driver was assumed by Andrew
Morton.
6.5 Kernel module parameters
For use in embedded environments with no cs89x0 EEPROM, the kernel boot
parameter `cs89x0_media=' has been implemented. Usage is:
cs89x0_media=rj45 or
cs89x0_media=aui or
cs89x0_media=bnc

View file

@ -0,0 +1,48 @@
#!/usr/bin/env python
# Copyright 2009 Simon Arlott
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
# more details.
#
# You should have received a copy of the GNU General Public License along with
# this program; if not, write to the Free Software Foundation, Inc., 59
# Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
# Usage: cxacru-cf.py < cxacru-cf.bin
# Output: values string suitable for the sysfs adsl_config attribute
#
# Warning: cxacru-cf.bin with MD5 hash cdbac2689969d5ed5d4850f117702110
# contains mis-aligned values which will stop the modem from being able
# to make a connection. If the first and last two bytes are removed then
# the values become valid, but the modulation will be forced to ANSI
# T1.413 only which may not be appropriate.
#
# The original binary format is a packed list of le32 values.
import sys
import struct
i = 0
while True:
buf = sys.stdin.read(4)
if len(buf) == 0:
break
elif len(buf) != 4:
sys.stdout.write("\n")
sys.stderr.write("Error: read {0} not 4 bytes\n".format(len(buf)))
sys.exit(1)
if i > 0:
sys.stdout.write(" ")
sys.stdout.write("{0:x}={1}".format(i, struct.unpack("<I", buf)[0]))
i += 1
sys.stdout.write("\n")

View file

@ -0,0 +1,100 @@
Firmware is required for this device: http://accessrunner.sourceforge.net/
While it is capable of managing/maintaining the ADSL connection without the
module loaded, the device will sometimes stop responding after unloading the
driver and it is necessary to unplug/remove power to the device to fix this.
Note: support for cxacru-cf.bin has been removed. It was not loaded correctly
so it had no effect on the device configuration. Fixing it could have stopped
existing devices working when an invalid configuration is supplied.
There is a script cxacru-cf.py to convert an existing file to the sysfs form.
Detected devices will appear as ATM devices named "cxacru". In /sys/class/atm/
these are directories named cxacruN where N is the device number. A symlink
named device points to the USB interface device's directory which contains
several sysfs attribute files for retrieving device statistics:
* adsl_controller_version
* adsl_headend
* adsl_headend_environment
Information about the remote headend.
* adsl_config
Configuration writing interface.
Write parameters in hexadecimal format <index>=<value>,
separated by whitespace, e.g.:
"1=0 a=5"
Up to 7 parameters at a time will be sent and the modem will restart
the ADSL connection when any value is set. These are logged for future
reference.
* downstream_attenuation (dB)
* downstream_bits_per_frame
* downstream_rate (kbps)
* downstream_snr_margin (dB)
Downstream stats.
* upstream_attenuation (dB)
* upstream_bits_per_frame
* upstream_rate (kbps)
* upstream_snr_margin (dB)
* transmitter_power (dBm/Hz)
Upstream stats.
* downstream_crc_errors
* downstream_fec_errors
* downstream_hec_errors
* upstream_crc_errors
* upstream_fec_errors
* upstream_hec_errors
Error counts.
* line_startable
Indicates that ADSL support on the device
is/can be enabled, see adsl_start.
* line_status
"initialising"
"down"
"attempting to activate"
"training"
"channel analysis"
"exchange"
"waiting"
"up"
Changes between "down" and "attempting to activate"
if there is no signal.
* link_status
"not connected"
"connected"
"lost"
* mac_address
* modulation
"" (when not connected)
"ANSI T1.413"
"ITU-T G.992.1 (G.DMT)"
"ITU-T G.992.2 (G.LITE)"
* startup_attempts
Count of total attempts to initialise ADSL.
To enable/disable ADSL, the following can be written to the adsl_state file:
"start"
"stop
"restart" (stops, waits 1.5s, then starts)
"poll" (used to resume status polling if it was disabled due to failure)
Changes in adsl/line state are reported via kernel log messages:
[4942145.150704] ATM dev 0: ADSL state: running
[4942243.663766] ATM dev 0: ADSL line: down
[4942249.665075] ATM dev 0: ADSL line: attempting to activate
[4942253.654954] ATM dev 0: ADSL line: training
[4942255.666387] ATM dev 0: ADSL line: channel analysis
[4942259.656262] ATM dev 0: ADSL line: exchange
[2635357.696901] ATM dev 0: ADSL line: up (8128 kb/s down | 832 kb/s up)

View file

@ -0,0 +1,352 @@
Chelsio N210 10Gb Ethernet Network Controller
Driver Release Notes for Linux
Version 2.1.1
June 20, 2005
CONTENTS
========
INTRODUCTION
FEATURES
PERFORMANCE
DRIVER MESSAGES
KNOWN ISSUES
SUPPORT
INTRODUCTION
============
This document describes the Linux driver for Chelsio 10Gb Ethernet Network
Controller. This driver supports the Chelsio N210 NIC and is backward
compatible with the Chelsio N110 model 10Gb NICs.
FEATURES
========
Adaptive Interrupts (adaptive-rx)
---------------------------------
This feature provides an adaptive algorithm that adjusts the interrupt
coalescing parameters, allowing the driver to dynamically adapt the latency
settings to achieve the highest performance during various types of network
load.
The interface used to control this feature is ethtool. Please see the
ethtool manpage for additional usage information.
By default, adaptive-rx is disabled.
To enable adaptive-rx:
ethtool -C <interface> adaptive-rx on
To disable adaptive-rx, use ethtool:
ethtool -C <interface> adaptive-rx off
After disabling adaptive-rx, the timer latency value will be set to 50us.
You may set the timer latency after disabling adaptive-rx:
ethtool -C <interface> rx-usecs <microseconds>
An example to set the timer latency value to 100us on eth0:
ethtool -C eth0 rx-usecs 100
You may also provide a timer latency value while disabling adaptive-rx:
ethtool -C <interface> adaptive-rx off rx-usecs <microseconds>
If adaptive-rx is disabled and a timer latency value is specified, the timer
will be set to the specified value until changed by the user or until
adaptive-rx is enabled.
To view the status of the adaptive-rx and timer latency values:
ethtool -c <interface>
TCP Segmentation Offloading (TSO) Support
-----------------------------------------
This feature, also known as "large send", enables a system's protocol stack
to offload portions of outbound TCP processing to a network interface card
thereby reducing system CPU utilization and enhancing performance.
The interface used to control this feature is ethtool version 1.8 or higher.
Please see the ethtool manpage for additional usage information.
By default, TSO is enabled.
To disable TSO:
ethtool -K <interface> tso off
To enable TSO:
ethtool -K <interface> tso on
To view the status of TSO:
ethtool -k <interface>
PERFORMANCE
===========
The following information is provided as an example of how to change system
parameters for "performance tuning" an what value to use. You may or may not
want to change these system parameters, depending on your server/workstation
application. Doing so is not warranted in any way by Chelsio Communications,
and is done at "YOUR OWN RISK". Chelsio will not be held responsible for loss
of data or damage to equipment.
Your distribution may have a different way of doing things, or you may prefer
a different method. These commands are shown only to provide an example of
what to do and are by no means definitive.
Making any of the following system changes will only last until you reboot
your system. You may want to write a script that runs at boot-up which
includes the optimal settings for your system.
Setting PCI Latency Timer:
setpci -d 1425:* 0x0c.l=0x0000F800
Disabling TCP timestamp:
sysctl -w net.ipv4.tcp_timestamps=0
Disabling SACK:
sysctl -w net.ipv4.tcp_sack=0
Setting large number of incoming connection requests:
sysctl -w net.ipv4.tcp_max_syn_backlog=3000
Setting maximum receive socket buffer size:
sysctl -w net.core.rmem_max=1024000
Setting maximum send socket buffer size:
sysctl -w net.core.wmem_max=1024000
Set smp_affinity (on a multiprocessor system) to a single CPU:
echo 1 > /proc/irq/<interrupt_number>/smp_affinity
Setting default receive socket buffer size:
sysctl -w net.core.rmem_default=524287
Setting default send socket buffer size:
sysctl -w net.core.wmem_default=524287
Setting maximum option memory buffers:
sysctl -w net.core.optmem_max=524287
Setting maximum backlog (# of unprocessed packets before kernel drops):
sysctl -w net.core.netdev_max_backlog=300000
Setting TCP read buffers (min/default/max):
sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"
Setting TCP write buffers (min/pressure/max):
sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"
Setting TCP buffer space (min/pressure/max):
sysctl -w net.ipv4.tcp_mem="10000000 10000000 10000000"
TCP window size for single connections:
The receive buffer (RX_WINDOW) size must be at least as large as the
Bandwidth-Delay Product of the communication link between the sender and
receiver. Due to the variations of RTT, you may want to increase the buffer
size up to 2 times the Bandwidth-Delay Product. Reference page 289 of
"TCP/IP Illustrated, Volume 1, The Protocols" by W. Richard Stevens.
At 10Gb speeds, use the following formula:
RX_WINDOW >= 1.25MBytes * RTT(in milliseconds)
Example for RTT with 100us: RX_WINDOW = (1,250,000 * 0.1) = 125,000
RX_WINDOW sizes of 256KB - 512KB should be sufficient.
Setting the min, max, and default receive buffer (RX_WINDOW) size:
sysctl -w net.ipv4.tcp_rmem="<min> <default> <max>"
TCP window size for multiple connections:
The receive buffer (RX_WINDOW) size may be calculated the same as single
connections, but should be divided by the number of connections. The
smaller window prevents congestion and facilitates better pacing,
especially if/when MAC level flow control does not work well or when it is
not supported on the machine. Experimentation may be necessary to attain
the correct value. This method is provided as a starting point for the
correct receive buffer size.
Setting the min, max, and default receive buffer (RX_WINDOW) size is
performed in the same manner as single connection.
DRIVER MESSAGES
===============
The following messages are the most common messages logged by syslog. These
may be found in /var/log/messages.
Driver up:
Chelsio Network Driver - version 2.1.1
NIC detected:
eth#: Chelsio N210 1x10GBaseX NIC (rev #), PCIX 133MHz/64-bit
Link up:
eth#: link is up at 10 Gbps, full duplex
Link down:
eth#: link is down
KNOWN ISSUES
============
These issues have been identified during testing. The following information
is provided as a workaround to the problem. In some cases, this problem is
inherent to Linux or to a particular Linux Distribution and/or hardware
platform.
1. Large number of TCP retransmits on a multiprocessor (SMP) system.
On a system with multiple CPUs, the interrupt (IRQ) for the network
controller may be bound to more than one CPU. This will cause TCP
retransmits if the packet data were to be split across different CPUs
and re-assembled in a different order than expected.
To eliminate the TCP retransmits, set smp_affinity on the particular
interrupt to a single CPU. You can locate the interrupt (IRQ) used on
the N110/N210 by using ifconfig:
ifconfig <dev_name> | grep Interrupt
Set the smp_affinity to a single CPU:
echo 1 > /proc/irq/<interrupt_number>/smp_affinity
It is highly suggested that you do not run the irqbalance daemon on your
system, as this will change any smp_affinity setting you have applied.
The irqbalance daemon runs on a 10 second interval and binds interrupts
to the least loaded CPU determined by the daemon. To disable this daemon:
chkconfig --level 2345 irqbalance off
By default, some Linux distributions enable the kernel feature,
irqbalance, which performs the same function as the daemon. To disable
this feature, add the following line to your bootloader:
noirqbalance
Example using the Grub bootloader:
title Red Hat Enterprise Linux AS (2.4.21-27.ELsmp)
root (hd0,0)
kernel /vmlinuz-2.4.21-27.ELsmp ro root=/dev/hda3 noirqbalance
initrd /initrd-2.4.21-27.ELsmp.img
2. After running insmod, the driver is loaded and the incorrect network
interface is brought up without running ifup.
When using 2.4.x kernels, including RHEL kernels, the Linux kernel
invokes a script named "hotplug". This script is primarily used to
automatically bring up USB devices when they are plugged in, however,
the script also attempts to automatically bring up a network interface
after loading the kernel module. The hotplug script does this by scanning
the ifcfg-eth# config files in /etc/sysconfig/network-scripts, looking
for HWADDR=<mac_address>.
If the hotplug script does not find the HWADDRR within any of the
ifcfg-eth# files, it will bring up the device with the next available
interface name. If this interface is already configured for a different
network card, your new interface will have incorrect IP address and
network settings.
To solve this issue, you can add the HWADDR=<mac_address> key to the
interface config file of your network controller.
To disable this "hotplug" feature, you may add the driver (module name)
to the "blacklist" file located in /etc/hotplug. It has been noted that
this does not work for network devices because the net.agent script
does not use the blacklist file. Simply remove, or rename, the net.agent
script located in /etc/hotplug to disable this feature.
3. Transport Protocol (TP) hangs when running heavy multi-connection traffic
on an AMD Opteron system with HyperTransport PCI-X Tunnel chipset.
If your AMD Opteron system uses the AMD-8131 HyperTransport PCI-X Tunnel
chipset, you may experience the "133-Mhz Mode Split Completion Data
Corruption" bug identified by AMD while using a 133Mhz PCI-X card on the
bus PCI-X bus.
AMD states, "Under highly specific conditions, the AMD-8131 PCI-X Tunnel
can provide stale data via split completion cycles to a PCI-X card that
is operating at 133 Mhz", causing data corruption.
AMD's provides three workarounds for this problem, however, Chelsio
recommends the first option for best performance with this bug:
For 133Mhz secondary bus operation, limit the transaction length and
the number of outstanding transactions, via BIOS configuration
programming of the PCI-X card, to the following:
Data Length (bytes): 1k
Total allowed outstanding transactions: 2
Please refer to AMD 8131-HT/PCI-X Errata 26310 Rev 3.08 August 2004,
section 56, "133-MHz Mode Split Completion Data Corruption" for more
details with this bug and workarounds suggested by AMD.
It may be possible to work outside AMD's recommended PCI-X settings, try
increasing the Data Length to 2k bytes for increased performance. If you
have issues with these settings, please revert to the "safe" settings
and duplicate the problem before submitting a bug or asking for support.
NOTE: The default setting on most systems is 8 outstanding transactions
and 2k bytes data length.
4. On multiprocessor systems, it has been noted that an application which
is handling 10Gb networking can switch between CPUs causing degraded
and/or unstable performance.
If running on an SMP system and taking performance measurements, it
is suggested you either run the latest netperf-2.4.0+ or use a binding
tool such as Tim Hockin's procstate utilities (runon)
<http://www.hockin.org/~thockin/procstate/>.
Binding netserver and netperf (or other applications) to particular
CPUs will have a significant difference in performance measurements.
You may need to experiment which CPU to bind the application to in
order to achieve the best performance for your system.
If you are developing an application designed for 10Gb networking,
please keep in mind you may want to look at kernel functions
sched_setaffinity & sched_getaffinity to bind your application.
If you are just running user-space applications such as ftp, telnet,
etc., you may want to try the runon tool provided by Tim Hockin's
procstate utility. You could also try binding the interface to a
particular CPU: runon 0 ifup eth0
SUPPORT
=======
If you have problems with the software or hardware, please contact our
customer support team via email at support@chelsio.com or check our website
at http://www.chelsio.com
===============================================================================
Chelsio Communications
370 San Aleso Ave.
Suite 100
Sunnyvale, CA 94085
http://www.chelsio.com
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License, version 2, as
published by the Free Software Foundation.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
Copyright (c) 2003-2005 Chelsio Communications. All rights reserved.
===============================================================================

View file

@ -0,0 +1,207 @@
DCCP protocol
=============
Contents
========
- Introduction
- Missing features
- Socket options
- Sysctl variables
- IOCTLs
- Other tunables
- Notes
Introduction
============
Datagram Congestion Control Protocol (DCCP) is an unreliable, connection
oriented protocol designed to solve issues present in UDP and TCP, particularly
for real-time and multimedia (streaming) traffic.
It divides into a base protocol (RFC 4340) and pluggable congestion control
modules called CCIDs. Like pluggable TCP congestion control, at least one CCID
needs to be enabled in order for the protocol to function properly. In the Linux
implementation, this is the TCP-like CCID2 (RFC 4341). Additional CCIDs, such as
the TCP-friendly CCID3 (RFC 4342), are optional.
For a brief introduction to CCIDs and suggestions for choosing a CCID to match
given applications, see section 10 of RFC 4340.
It has a base protocol and pluggable congestion control IDs (CCIDs).
DCCP is a Proposed Standard (RFC 2026), and the homepage for DCCP as a protocol
is at http://www.ietf.org/html.charters/dccp-charter.html
Missing features
================
The Linux DCCP implementation does not currently support all the features that are
specified in RFCs 4340...42.
The known bugs are at:
http://www.linuxfoundation.org/collaborate/workgroups/networking/todo#DCCP
For more up-to-date versions of the DCCP implementation, please consider using
the experimental DCCP test tree; instructions for checking this out are on:
http://www.linuxfoundation.org/collaborate/workgroups/networking/dccp_testing#Experimental_DCCP_source_tree
Socket options
==============
DCCP_SOCKOPT_QPOLICY_ID sets the dequeuing policy for outgoing packets. It takes
a policy ID as argument and can only be set before the connection (i.e. changes
during an established connection are not supported). Currently, two policies are
defined: the "simple" policy (DCCPQ_POLICY_SIMPLE), which does nothing special,
and a priority-based variant (DCCPQ_POLICY_PRIO). The latter allows to pass an
u32 priority value as ancillary data to sendmsg(), where higher numbers indicate
a higher packet priority (similar to SO_PRIORITY). This ancillary data needs to
be formatted using a cmsg(3) message header filled in as follows:
cmsg->cmsg_level = SOL_DCCP;
cmsg->cmsg_type = DCCP_SCM_PRIORITY;
cmsg->cmsg_len = CMSG_LEN(sizeof(uint32_t)); /* or CMSG_LEN(4) */
DCCP_SOCKOPT_QPOLICY_TXQLEN sets the maximum length of the output queue. A zero
value is always interpreted as unbounded queue length. If different from zero,
the interpretation of this parameter depends on the current dequeuing policy
(see above): the "simple" policy will enforce a fixed queue size by returning
EAGAIN, whereas the "prio" policy enforces a fixed queue length by dropping the
lowest-priority packet first. The default value for this parameter is
initialised from /proc/sys/net/dccp/default/tx_qlen.
DCCP_SOCKOPT_SERVICE sets the service. The specification mandates use of
service codes (RFC 4340, sec. 8.1.2); if this socket option is not set,
the socket will fall back to 0 (which means that no meaningful service code
is present). On active sockets this is set before connect(); specifying more
than one code has no effect (all subsequent service codes are ignored). The
case is different for passive sockets, where multiple service codes (up to 32)
can be set before calling bind().
DCCP_SOCKOPT_GET_CUR_MPS is read-only and retrieves the current maximum packet
size (application payload size) in bytes, see RFC 4340, section 14.
DCCP_SOCKOPT_AVAILABLE_CCIDS is also read-only and returns the list of CCIDs
supported by the endpoint. The option value is an array of type uint8_t whose
size is passed as option length. The minimum array size is 4 elements, the
value returned in the optlen argument always reflects the true number of
built-in CCIDs.
DCCP_SOCKOPT_CCID is write-only and sets both the TX and RX CCIDs at the same
time, combining the operation of the next two socket options. This option is
preferable over the latter two, since often applications will use the same
type of CCID for both directions; and mixed use of CCIDs is not currently well
understood. This socket option takes as argument at least one uint8_t value, or
an array of uint8_t values, which must match available CCIDS (see above). CCIDs
must be registered on the socket before calling connect() or listen().
DCCP_SOCKOPT_TX_CCID is read/write. It returns the current CCID (if set) or sets
the preference list for the TX CCID, using the same format as DCCP_SOCKOPT_CCID.
Please note that the getsockopt argument type here is `int', not uint8_t.
DCCP_SOCKOPT_RX_CCID is analogous to DCCP_SOCKOPT_TX_CCID, but for the RX CCID.
DCCP_SOCKOPT_SERVER_TIMEWAIT enables the server (listening socket) to hold
timewait state when closing the connection (RFC 4340, 8.3). The usual case is
that the closing server sends a CloseReq, whereupon the client holds timewait
state. When this boolean socket option is on, the server sends a Close instead
and will enter TIMEWAIT. This option must be set after accept() returns.
DCCP_SOCKOPT_SEND_CSCOV and DCCP_SOCKOPT_RECV_CSCOV are used for setting the
partial checksum coverage (RFC 4340, sec. 9.2). The default is that checksums
always cover the entire packet and that only fully covered application data is
accepted by the receiver. Hence, when using this feature on the sender, it must
be enabled at the receiver, too with suitable choice of CsCov.
DCCP_SOCKOPT_SEND_CSCOV sets the sender checksum coverage. Values in the
range 0..15 are acceptable. The default setting is 0 (full coverage),
values between 1..15 indicate partial coverage.
DCCP_SOCKOPT_RECV_CSCOV is for the receiver and has a different meaning: it
sets a threshold, where again values 0..15 are acceptable. The default
of 0 means that all packets with a partial coverage will be discarded.
Values in the range 1..15 indicate that packets with minimally such a
coverage value are also acceptable. The higher the number, the more
restrictive this setting (see [RFC 4340, sec. 9.2.1]). Partial coverage
settings are inherited to the child socket after accept().
The following two options apply to CCID 3 exclusively and are getsockopt()-only.
In either case, a TFRC info struct (defined in <linux/tfrc.h>) is returned.
DCCP_SOCKOPT_CCID_RX_INFO
Returns a `struct tfrc_rx_info' in optval; the buffer for optval and
optlen must be set to at least sizeof(struct tfrc_rx_info).
DCCP_SOCKOPT_CCID_TX_INFO
Returns a `struct tfrc_tx_info' in optval; the buffer for optval and
optlen must be set to at least sizeof(struct tfrc_tx_info).
On unidirectional connections it is useful to close the unused half-connection
via shutdown (SHUT_WR or SHUT_RD): this will reduce per-packet processing costs.
Sysctl variables
================
Several DCCP default parameters can be managed by the following sysctls
(sysctl net.dccp.default or /proc/sys/net/dccp/default):
request_retries
The number of active connection initiation retries (the number of
Requests minus one) before timing out. In addition, it also governs
the behaviour of the other, passive side: this variable also sets
the number of times DCCP repeats sending a Response when the initial
handshake does not progress from RESPOND to OPEN (i.e. when no Ack
is received after the initial Request). This value should be greater
than 0, suggested is less than 10. Analogue of tcp_syn_retries.
retries1
How often a DCCP Response is retransmitted until the listening DCCP
side considers its connecting peer dead. Analogue of tcp_retries1.
retries2
The number of times a general DCCP packet is retransmitted. This has
importance for retransmitted acknowledgments and feature negotiation,
data packets are never retransmitted. Analogue of tcp_retries2.
tx_ccid = 2
Default CCID for the sender-receiver half-connection. Depending on the
choice of CCID, the Send Ack Vector feature is enabled automatically.
rx_ccid = 2
Default CCID for the receiver-sender half-connection; see tx_ccid.
seq_window = 100
The initial sequence window (sec. 7.5.2) of the sender. This influences
the local ackno validity and the remote seqno validity windows (7.5.1).
Values in the range Wmin = 32 (RFC 4340, 7.5.2) up to 2^32-1 can be set.
tx_qlen = 5
The size of the transmit buffer in packets. A value of 0 corresponds
to an unbounded transmit buffer.
sync_ratelimit = 125 ms
The timeout between subsequent DCCP-Sync packets sent in response to
sequence-invalid packets on the same socket (RFC 4340, 7.5.4). The unit
of this parameter is milliseconds; a value of 0 disables rate-limiting.
IOCTLS
======
FIONREAD
Works as in udp(7): returns in the `int' argument pointer the size of
the next pending datagram in bytes, or 0 when no datagram is pending.
Other tunables
==============
Per-route rto_min support
CCID-2 supports the RTAX_RTO_MIN per-route setting for the minimum value
of the RTO timer. This setting can be modified via the 'rto_min' option
of iproute2; for example:
> ip route change 10.0.0.0/24 rto_min 250j dev wlan0
> ip route add 10.0.0.254/32 rto_min 800j dev wlan0
> ip route show dev wlan0
CCID-3 also supports the rto_min setting: it is used to define the lower
bound for the expiry of the nofeedback timer. This can be useful on LANs
with very low RTTs (e.g., loopback, Gbit ethernet).
Notes
=====
DCCP does not travel through NAT successfully at present on many boxes. This is
because the checksum covers the pseudo-header as per TCP and UDP. Linux NAT
support for DCCP has been added.

View file

@ -0,0 +1,43 @@
DCTCP (DataCenter TCP)
----------------------
DCTCP is an enhancement to the TCP congestion control algorithm for data
center networks and leverages Explicit Congestion Notification (ECN) in
the data center network to provide multi-bit feedback to the end hosts.
To enable it on end hosts:
sysctl -w net.ipv4.tcp_congestion_control=dctcp
All switches in the data center network running DCTCP must support ECN
marking and be configured for marking when reaching defined switch buffer
thresholds. The default ECN marking threshold heuristic for DCTCP on
switches is 20 packets (30KB) at 1Gbps, and 65 packets (~100KB) at 10Gbps,
but might need further careful tweaking.
For more details, see below documents:
Paper:
The algorithm is further described in detail in the following two
SIGCOMM/SIGMETRICS papers:
i) Mohammad Alizadeh, Albert Greenberg, David A. Maltz, Jitendra Padhye,
Parveen Patel, Balaji Prabhakar, Sudipta Sengupta, and Murari Sridharan:
"Data Center TCP (DCTCP)", Data Center Networks session
Proc. ACM SIGCOMM, New Delhi, 2010.
http://simula.stanford.edu/~alizade/Site/DCTCP_files/dctcp-final.pdf
http://www.sigcomm.org/ccr/papers/2010/October/1851275.1851192
ii) Mohammad Alizadeh, Adel Javanmard, and Balaji Prabhakar:
"Analysis of DCTCP: Stability, Convergence, and Fairness"
Proc. ACM SIGMETRICS, San Jose, 2011.
http://simula.stanford.edu/~alizade/Site/DCTCP_files/dctcp_analysis-full.pdf
IETF informational draft:
http://tools.ietf.org/html/draft-bensley-tcpm-dctcp-00
DCTCP site:
http://simula.stanford.edu/~alizade/Site/DCTCP.html

View file

@ -0,0 +1,178 @@
Originally, this driver was written for the Digital Equipment
Corporation series of EtherWORKS Ethernet cards:
DE425 TP/COAX EISA
DE434 TP PCI
DE435 TP/COAX/AUI PCI
DE450 TP/COAX/AUI PCI
DE500 10/100 PCI Fasternet
but it will now attempt to support all cards which conform to the
Digital Semiconductor SROM Specification. The driver currently
recognises the following chips:
DC21040 (no SROM)
DC21041[A]
DC21140[A]
DC21142
DC21143
So far the driver is known to work with the following cards:
KINGSTON
Linksys
ZNYX342
SMC8432
SMC9332 (w/new SROM)
ZNYX31[45]
ZNYX346 10/100 4 port (can act as a 10/100 bridge!)
The driver has been tested on a relatively busy network using the DE425,
DE434, DE435 and DE500 cards and benchmarked with 'ttcp': it transferred
16M of data to a DECstation 5000/200 as follows:
TCP UDP
TX RX TX RX
DE425 1030k 997k 1170k 1128k
DE434 1063k 995k 1170k 1125k
DE435 1063k 995k 1170k 1125k
DE500 1063k 998k 1170k 1125k in 10Mb/s mode
All values are typical (in kBytes/sec) from a sample of 4 for each
measurement. Their error is +/-20k on a quiet (private) network and also
depend on what load the CPU has.
=========================================================================
The ability to load this driver as a loadable module has been included
and used extensively during the driver development (to save those long
reboot sequences). Loadable module support under PCI and EISA has been
achieved by letting the driver autoprobe as if it were compiled into the
kernel. Do make sure you're not sharing interrupts with anything that
cannot accommodate interrupt sharing!
To utilise this ability, you have to do 8 things:
0) have a copy of the loadable modules code installed on your system.
1) copy de4x5.c from the /linux/drivers/net directory to your favourite
temporary directory.
2) for fixed autoprobes (not recommended), edit the source code near
line 5594 to reflect the I/O address you're using, or assign these when
loading by:
insmod de4x5 io=0xghh where g = bus number
hh = device number
NB: autoprobing for modules is now supported by default. You may just
use:
insmod de4x5
to load all available boards. For a specific board, still use
the 'io=?' above.
3) compile de4x5.c, but include -DMODULE in the command line to ensure
that the correct bits are compiled (see end of source code).
4) if you are wanting to add a new card, goto 5. Otherwise, recompile a
kernel with the de4x5 configuration turned off and reboot.
5) insmod de4x5 [io=0xghh]
6) run the net startup bits for your new eth?? interface(s) manually
(usually /etc/rc.inet[12] at boot time).
7) enjoy!
To unload a module, turn off the associated interface(s)
'ifconfig eth?? down' then 'rmmod de4x5'.
Automedia detection is included so that in principle you can disconnect
from, e.g. TP, reconnect to BNC and things will still work (after a
pause whilst the driver figures out where its media went). My tests
using ping showed that it appears to work....
By default, the driver will now autodetect any DECchip based card.
Should you have a need to restrict the driver to DIGITAL only cards, you
can compile with a DEC_ONLY define, or if loading as a module, use the
'dec_only=1' parameter.
I've changed the timing routines to use the kernel timer and scheduling
functions so that the hangs and other assorted problems that occurred
while autosensing the media should be gone. A bonus for the DC21040
auto media sense algorithm is that it can now use one that is more in
line with the rest (the DC21040 chip doesn't have a hardware timer).
The downside is the 1 'jiffies' (10ms) resolution.
IEEE 802.3u MII interface code has been added in anticipation that some
products may use it in the future.
The SMC9332 card has a non-compliant SROM which needs fixing - I have
patched this driver to detect it because the SROM format used complies
to a previous DEC-STD format.
I have removed the buffer copies needed for receive on Intels. I cannot
remove them for Alphas since the Tulip hardware only does longword
aligned DMA transfers and the Alphas get alignment traps with non
longword aligned data copies (which makes them really slow). No comment.
I have added SROM decoding routines to make this driver work with any
card that supports the Digital Semiconductor SROM spec. This will help
all cards running the dc2114x series chips in particular. Cards using
the dc2104x chips should run correctly with the basic driver. I'm in
debt to <mjacob@feral.com> for the testing and feedback that helped get
this feature working. So far we have tested KINGSTON, SMC8432, SMC9332
(with the latest SROM complying with the SROM spec V3: their first was
broken), ZNYX342 and LinkSys. ZNYX314 (dual 21041 MAC) and ZNYX 315
(quad 21041 MAC) cards also appear to work despite their incorrectly
wired IRQs.
I have added a temporary fix for interrupt problems when some SCSI cards
share the same interrupt as the DECchip based cards. The problem occurs
because the SCSI card wants to grab the interrupt as a fast interrupt
(runs the service routine with interrupts turned off) vs. this card
which really needs to run the service routine with interrupts turned on.
This driver will now add the interrupt service routine as a fast
interrupt if it is bounced from the slow interrupt. THIS IS NOT A
RECOMMENDED WAY TO RUN THE DRIVER and has been done for a limited time
until people sort out their compatibility issues and the kernel
interrupt service code is fixed. YOU SHOULD SEPARATE OUT THE FAST
INTERRUPT CARDS FROM THE SLOW INTERRUPT CARDS to ensure that they do not
run on the same interrupt. PCMCIA/CardBus is another can of worms...
Finally, I think I have really fixed the module loading problem with
more than one DECchip based card. As a side effect, I don't mess with
the device structure any more which means that if more than 1 card in
2.0.x is installed (4 in 2.1.x), the user will have to edit
linux/drivers/net/Space.c to make room for them. Hence, module loading
is the preferred way to use this driver, since it doesn't have this
limitation.
Where SROM media detection is used and full duplex is specified in the
SROM, the feature is ignored unless lp->params.fdx is set at compile
time OR during a module load (insmod de4x5 args='eth??:fdx' [see
below]). This is because there is no way to automatically detect full
duplex links except through autonegotiation. When I include the
autonegotiation feature in the SROM autoconf code, this detection will
occur automatically for that case.
Command line arguments are now allowed, similar to passing arguments
through LILO. This will allow a per adapter board set up of full duplex
and media. The only lexical constraints are: the board name (dev->name)
appears in the list before its parameters. The list of parameters ends
either at the end of the parameter list or with another board name. The
following parameters are allowed:
fdx for full duplex
autosense to set the media/speed; with the following
sub-parameters:
TP, TP_NW, BNC, AUI, BNC_AUI, 100Mb, 10Mb, AUTO
Case sensitivity is important for the sub-parameters. They *must* be
upper case. Examples:
insmod de4x5 args='eth1:fdx autosense=BNC eth0:autosense=100Mb'.
For a compiled in driver, in linux/drivers/net/CONFIG, place e.g.
DE4X5_OPTS = -DDE4X5_PARM='"eth0:fdx autosense=AUI eth2:autosense=TP"'
Yes, I know full duplex isn't permissible on BNC or AUI; they're just
examples. By default, full duplex is turned off and AUTO is the default
autosense setting. In reality, I expect only the full duplex option to
be used. Note the use of single quotes in the two examples above and the
lack of commas to separate items.

View file

@ -0,0 +1,232 @@
Linux DECnet Networking Layer Information
===========================================
1) Other documentation....
o Project Home Pages
http://www.chygwyn.com/ - Kernel info
http://linux-decnet.sourceforge.net/ - Userland tools
http://www.sourceforge.net/projects/linux-decnet/ - Status page
2) Configuring the kernel
Be sure to turn on the following options:
CONFIG_DECNET (obviously)
CONFIG_PROC_FS (to see what's going on)
CONFIG_SYSCTL (for easy configuration)
if you want to try out router support (not properly debugged yet)
you'll need the following options as well...
CONFIG_DECNET_ROUTER (to be able to add/delete routes)
CONFIG_NETFILTER (will be required for the DECnet routing daemon)
CONFIG_DECNET_ROUTE_FWMARK is optional
Don't turn on SIOCGIFCONF support for DECnet unless you are really sure
that you need it, in general you won't and it can cause ifconfig to
malfunction.
Run time configuration has changed slightly from the 2.4 system. If you
want to configure an endnode, then the simplified procedure is as follows:
o Set the MAC address on your ethernet card before starting _any_ other
network protocols.
As soon as your network card is brought into the UP state, DECnet should
start working. If you need something more complicated or are unsure how
to set the MAC address, see the next section. Also all configurations which
worked with 2.4 will work under 2.5 with no change.
3) Command line options
You can set a DECnet address on the kernel command line for compatibility
with the 2.4 configuration procedure, but in general it's not needed any more.
If you do st a DECnet address on the command line, it has only one purpose
which is that its added to the addresses on the loopback device.
With 2.4 kernels, DECnet would only recognise addresses as local if they
were added to the loopback device. In 2.5, any local interface address
can be used to loop back to the local machine. Of course this does not
prevent you adding further addresses to the loopback device if you
want to.
N.B. Since the address list of an interface determines the addresses for
which "hello" messages are sent, if you don't set an address on the loopback
interface then you won't see any entries in /proc/net/neigh for the local
host until such time as you start a connection. This doesn't affect the
operation of the local communications in any other way though.
The kernel command line takes options looking like the following:
decnet.addr=1,2
the two numbers are the node address 1,2 = 1.2 For 2.2.xx kernels
and early 2.3.xx kernels, you must use a comma when specifying the
DECnet address like this. For more recent 2.3.xx kernels, you may
use almost any character except space, although a `.` would be the most
obvious choice :-)
There used to be a third number specifying the node type. This option
has gone away in favour of a per interface node type. This is now set
using /proc/sys/net/decnet/conf/<dev>/forwarding. This file can be
set with a single digit, 0=EndNode, 1=L1 Router and 2=L2 Router.
There are also equivalent options for modules. The node address can
also be set through the /proc/sys/net/decnet/ files, as can other system
parameters.
Currently the only supported devices are ethernet and ip_gre. The
ethernet address of your ethernet card has to be set according to the DECnet
address of the node in order for it to be autoconfigured (and then appear in
/proc/net/decnet_dev). There is a utility available at the above
FTP sites called dn2ethaddr which can compute the correct ethernet
address to use. The address can be set by ifconfig either before or
at the time the device is brought up. If you are using RedHat you can
add the line:
MACADDR=AA:00:04:00:03:04
or something similar, to /etc/sysconfig/network-scripts/ifcfg-eth0 or
wherever your network card's configuration lives. Setting the MAC address
of your ethernet card to an address starting with "hi-ord" will cause a
DECnet address which matches to be added to the interface (which you can
verify with iproute2).
The default device for routing can be set through the /proc filesystem
by setting /proc/sys/net/decnet/default_device to the
device you want DECnet to route packets out of when no specific route
is available. Usually this will be eth0, for example:
echo -n "eth0" >/proc/sys/net/decnet/default_device
If you don't set the default device, then it will default to the first
ethernet card which has been autoconfigured as described above. You can
confirm that by looking in the default_device file of course.
There is a list of what the other files under /proc/sys/net/decnet/ do
on the kernel patch web site (shown above).
4) Run time kernel configuration
This is either done through the sysctl/proc interface (see the kernel web
pages for details on what the various options do) or through the iproute2
package in the same way as IPv4/6 configuration is performed.
Documentation for iproute2 is included with the package, although there is
as yet no specific section on DECnet, most of the features apply to both
IP and DECnet, albeit with DECnet addresses instead of IP addresses and
a reduced functionality.
If you want to configure a DECnet router you'll need the iproute2 package
since its the _only_ way to add and delete routes currently. Eventually
there will be a routing daemon to send and receive routing messages for
each interface and update the kernel routing tables accordingly. The
routing daemon will use netfilter to listen to routing packets, and
rtnetlink to update the kernels routing tables.
The DECnet raw socket layer has been removed since it was there purely
for use by the routing daemon which will now use netfilter (a much cleaner
and more generic solution) instead.
5) How can I tell if its working ?
Here is a quick guide of what to look for in order to know if your DECnet
kernel subsystem is working.
- Is the node address set (see /proc/sys/net/decnet/node_address)
- Is the node of the correct type
(see /proc/sys/net/decnet/conf/<dev>/forwarding)
- Is the Ethernet MAC address of each Ethernet card set to match
the DECnet address. If in doubt use the dn2ethaddr utility available
at the ftp archive.
- If the previous two steps are satisfied, and the Ethernet card is up,
you should find that it is listed in /proc/net/decnet_dev and also
that it appears as a directory in /proc/sys/net/decnet/conf/. The
loopback device (lo) should also appear and is required to communicate
within a node.
- If you have any DECnet routers on your network, they should appear
in /proc/net/decnet_neigh, otherwise this file will only contain the
entry for the node itself (if it doesn't check to see if lo is up).
- If you want to send to any node which is not listed in the
/proc/net/decnet_neigh file, you'll need to set the default device
to point to an Ethernet card with connection to a router. This is
again done with the /proc/sys/net/decnet/default_device file.
- Try starting a simple server and client, like the dnping/dnmirror
over the loopback interface. With luck they should communicate.
For this step and those after, you'll need the DECnet library
which can be obtained from the above ftp sites as well as the
actual utilities themselves.
- If this seems to work, then try talking to a node on your local
network, and see if you can obtain the same results.
- At this point you are on your own... :-)
6) How to send a bug report
If you've found a bug and want to report it, then there are several things
you can do to help me work out exactly what it is that is wrong. Useful
information (_most_ of which _is_ _essential_) includes:
- What kernel version are you running ?
- What version of the patch are you running ?
- How far though the above set of tests can you get ?
- What is in the /proc/decnet* files and /proc/sys/net/decnet/* files ?
- Which services are you running ?
- Which client caused the problem ?
- How much data was being transferred ?
- Was the network congested ?
- How can the problem be reproduced ?
- Can you use tcpdump to get a trace ? (N.B. Most (all?) versions of
tcpdump don't understand how to dump DECnet properly, so including
the hex listing of the packet contents is _essential_, usually the -x flag.
You may also need to increase the length grabbed with the -s flag. The
-e flag also provides very useful information (ethernet MAC addresses))
7) MAC FAQ
A quick FAQ on ethernet MAC addresses to explain how Linux and DECnet
interact and how to get the best performance from your hardware.
Ethernet cards are designed to normally only pass received network frames
to a host computer when they are addressed to it, or to the broadcast address.
Linux has an interface which allows the setting of extra addresses for
an ethernet card to listen to. If the ethernet card supports it, the
filtering operation will be done in hardware, if not the extra unwanted packets
received will be discarded by the host computer. In the latter case,
significant processor time and bus bandwidth can be used up on a busy
network (see the NAPI documentation for a longer explanation of these
effects).
DECnet makes use of this interface to allow running DECnet on an ethernet
card which has already been configured using TCP/IP (presumably using the
built in MAC address of the card, as usual) and/or to allow multiple DECnet
addresses on each physical interface. If you do this, be aware that if your
ethernet card doesn't support perfect hashing in its MAC address filter
then your computer will be doing more work than required. Some cards
will simply set themselves into promiscuous mode in order to receive
packets from the DECnet specified addresses. So if you have one of these
cards its better to set the MAC address of the card as described above
to gain the best efficiency. Better still is to use a card which supports
NAPI as well.
8) Mailing list
If you are keen to get involved in development, or want to ask questions
about configuration, or even just report bugs, then there is a mailing
list that you can join, details are at:
http://sourceforge.net/mail/?group_id=4993
9) Legal Info
The Linux DECnet project team have placed their code under the GPL. The
software is provided "as is" and without warranty express or implied.
DECnet is a trademark of Compaq. This software is not a product of
Compaq. We acknowledge the help of people at Compaq in providing extra
documentation above and beyond what was previously publicly available.
Steve Whitehouse <SteveW@ACM.org>

View file

@ -0,0 +1,282 @@
D-Link DL2000-based Gigabit Ethernet Adapter Installation
for Linux
May 23, 2002
Contents
========
- Compatibility List
- Quick Install
- Compiling the Driver
- Installing the Driver
- Option parameter
- Configuration Script Sample
- Troubleshooting
Compatibility List
=================
Adapter Support:
D-Link DGE-550T Gigabit Ethernet Adapter.
D-Link DGE-550SX Gigabit Ethernet Adapter.
D-Link DL2000-based Gigabit Ethernet Adapter.
The driver support Linux kernel 2.4.7 later. We had tested it
on the environments below.
. Red Hat v6.2 (update kernel to 2.4.7)
. Red Hat v7.0 (update kernel to 2.4.7)
. Red Hat v7.1 (kernel 2.4.7)
. Red Hat v7.2 (kernel 2.4.7-10)
Quick Install
=============
Install linux driver as following command:
1. make all
2. insmod dl2k.ko
3. ifconfig eth0 up 10.xxx.xxx.xxx netmask 255.0.0.0
^^^^^^^^^^^^^^^\ ^^^^^^^^\
IP NETMASK
Now eth0 should active, you can test it by "ping" or get more information by
"ifconfig". If tested ok, continue the next step.
4. cp dl2k.ko /lib/modules/`uname -r`/kernel/drivers/net
5. Add the following line to /etc/modprobe.d/dl2k.conf:
alias eth0 dl2k
6. Run depmod to updated module indexes.
7. Run "netconfig" or "netconf" to create configuration script ifcfg-eth0
located at /etc/sysconfig/network-scripts or create it manually.
[see - Configuration Script Sample]
8. Driver will automatically load and configure at next boot time.
Compiling the Driver
====================
In Linux, NIC drivers are most commonly configured as loadable modules.
The approach of building a monolithic kernel has become obsolete. The driver
can be compiled as part of a monolithic kernel, but is strongly discouraged.
The remainder of this section assumes the driver is built as a loadable module.
In the Linux environment, it is a good idea to rebuild the driver from the
source instead of relying on a precompiled version. This approach provides
better reliability since a precompiled driver might depend on libraries or
kernel features that are not present in a given Linux installation.
The 3 files necessary to build Linux device driver are dl2k.c, dl2k.h and
Makefile. To compile, the Linux installation must include the gcc compiler,
the kernel source, and the kernel headers. The Linux driver supports Linux
Kernels 2.4.7. Copy the files to a directory and enter the following command
to compile and link the driver:
CD-ROM drive
------------
[root@XXX /] mkdir cdrom
[root@XXX /] mount -r -t iso9660 -o conv=auto /dev/cdrom /cdrom
[root@XXX /] cd root
[root@XXX /root] mkdir dl2k
[root@XXX /root] cd dl2k
[root@XXX dl2k] cp /cdrom/linux/dl2k.tgz /root/dl2k
[root@XXX dl2k] tar xfvz dl2k.tgz
[root@XXX dl2k] make all
Floppy disc drive
-----------------
[root@XXX /] cd root
[root@XXX /root] mkdir dl2k
[root@XXX /root] cd dl2k
[root@XXX dl2k] mcopy a:/linux/dl2k.tgz /root/dl2k
[root@XXX dl2k] tar xfvz dl2k.tgz
[root@XXX dl2k] make all
Installing the Driver
=====================
Manual Installation
-------------------
Once the driver has been compiled, it must be loaded, enabled, and bound
to a protocol stack in order to establish network connectivity. To load a
module enter the command:
insmod dl2k.o
or
insmod dl2k.o <optional parameter> ; add parameter
===============================================================
example: insmod dl2k.o media=100mbps_hd
or insmod dl2k.o media=3
or insmod dl2k.o media=3,2 ; for 2 cards
===============================================================
Please reference the list of the command line parameters supported by
the Linux device driver below.
The insmod command only loads the driver and gives it a name of the form
eth0, eth1, etc. To bring the NIC into an operational state,
it is necessary to issue the following command:
ifconfig eth0 up
Finally, to bind the driver to the active protocol (e.g., TCP/IP with
Linux), enter the following command:
ifup eth0
Note that this is meaningful only if the system can find a configuration
script that contains the necessary network information. A sample will be
given in the next paragraph.
The commands to unload a driver are as follows:
ifdown eth0
ifconfig eth0 down
rmmod dl2k.o
The following are the commands to list the currently loaded modules and
to see the current network configuration.
lsmod
ifconfig
Automated Installation
----------------------
This section describes how to install the driver such that it is
automatically loaded and configured at boot time. The following description
is based on a Red Hat 6.0/7.0 distribution, but it can easily be ported to
other distributions as well.
Red Hat v6.x/v7.x
-----------------
1. Copy dl2k.o to the network modules directory, typically
/lib/modules/2.x.x-xx/net or /lib/modules/2.x.x/kernel/drivers/net.
2. Locate the boot module configuration file, most commonly in the
/etc/modprobe.d/ directory. Add the following lines:
alias ethx dl2k
options dl2k <optional parameters>
where ethx will be eth0 if the NIC is the only ethernet adapter, eth1 if
one other ethernet adapter is installed, etc. Refer to the table in the
previous section for the list of optional parameters.
3. Locate the network configuration scripts, normally the
/etc/sysconfig/network-scripts directory, and create a configuration
script named ifcfg-ethx that contains network information.
4. Note that for most Linux distributions, Red Hat included, a configuration
utility with a graphical user interface is provided to perform steps 2
and 3 above.
Parameter Description
=====================
You can install this driver without any additional parameter. However, if you
are going to have extensive functions then it is necessary to set extra
parameter. Below is a list of the command line parameters supported by the
Linux device
driver.
mtu=packet_size - Specifies the maximum packet size. default
is 1500.
media=media_type - Specifies the media type the NIC operates at.
autosense Autosensing active media.
10mbps_hd 10Mbps half duplex.
10mbps_fd 10Mbps full duplex.
100mbps_hd 100Mbps half duplex.
100mbps_fd 100Mbps full duplex.
1000mbps_fd 1000Mbps full duplex.
1000mbps_hd 1000Mbps half duplex.
0 Autosensing active media.
1 10Mbps half duplex.
2 10Mbps full duplex.
3 100Mbps half duplex.
4 100Mbps full duplex.
5 1000Mbps half duplex.
6 1000Mbps full duplex.
By default, the NIC operates at autosense.
1000mbps_fd and 1000mbps_hd types are only
available for fiber adapter.
vlan=n - Specifies the VLAN ID. If vlan=0, the
Virtual Local Area Network (VLAN) function is
disable.
jumbo=[0|1] - Specifies the jumbo frame support. If jumbo=1,
the NIC accept jumbo frames. By default, this
function is disabled.
Jumbo frame usually improve the performance
int gigabit.
This feature need jumbo frame compatible
remote.
rx_coalesce=m - Number of rx frame handled each interrupt.
rx_timeout=n - Rx DMA wait time for an interrupt.
If set rx_coalesce > 0, hardware only assert
an interrupt for m frames. Hardware won't
assert rx interrupt until m frames received or
reach timeout of n * 640 nano seconds.
Set proper rx_coalesce and rx_timeout can
reduce congestion collapse and overload which
has been a bottleneck for high speed network.
For example, rx_coalesce=10 rx_timeout=800.
that is, hardware assert only 1 interrupt
for 10 frames received or timeout of 512 us.
tx_coalesce=n - Number of tx frame handled each interrupt.
Set n > 1 can reduce the interrupts
congestion usually lower performance of
high speed network card. Default is 16.
tx_flow=[1|0] - Specifies the Tx flow control. If tx_flow=0,
the Tx flow control disable else driver
autodetect.
rx_flow=[1|0] - Specifies the Rx flow control. If rx_flow=0,
the Rx flow control enable else driver
autodetect.
Configuration Script Sample
===========================
Here is a sample of a simple configuration script:
DEVICE=eth0
USERCTL=no
ONBOOT=yes
POOTPROTO=none
BROADCAST=207.200.5.255
NETWORK=207.200.5.0
NETMASK=255.255.255.0
IPADDR=207.200.5.2
Troubleshooting
===============
Q1. Source files contain ^ M behind every line.
Make sure all files are Unix file format (no LF). Try the following
shell command to convert files.
cat dl2k.c | col -b > dl2k.tmp
mv dl2k.tmp dl2k.c
OR
cat dl2k.c | tr -d "\r" > dl2k.tmp
mv dl2k.tmp dl2k.c
Q2: Could not find header files (*.h) ?
To compile the driver, you need kernel header files. After
installing the kernel source, the header files are usually located in
/usr/src/linux/include, which is the default include directory configured
in Makefile. For some distributions, there is a copy of header files in
/usr/src/include/linux and /usr/src/include/asm, that you can change the
INCLUDEDIR in Makefile to /usr/include without installing kernel source.
Note that RH 7.0 didn't provide correct header files in /usr/include,
including those files will make a wrong version driver.

View file

@ -0,0 +1,167 @@
DM9000 Network driver
=====================
Copyright 2008 Simtec Electronics,
Ben Dooks <ben@simtec.co.uk> <ben-linux@fluff.org>
Introduction
------------
This file describes how to use the DM9000 platform-device based network driver
that is contained in the files drivers/net/dm9000.c and drivers/net/dm9000.h.
The driver supports three DM9000 variants, the DM9000E which is the first chip
supported as well as the newer DM9000A and DM9000B devices. It is currently
maintained and tested by Ben Dooks, who should be CC: to any patches for this
driver.
Defining the platform device
----------------------------
The minimum set of resources attached to the platform device are as follows:
1) The physical address of the address register
2) The physical address of the data register
3) The IRQ line the device's interrupt pin is connected to.
These resources should be specified in that order, as the ordering of the
two address regions is important (the driver expects these to be address
and then data).
An example from arch/arm/mach-s3c2410/mach-bast.c is:
static struct resource bast_dm9k_resource[] = {
[0] = {
.start = S3C2410_CS5 + BAST_PA_DM9000,
.end = S3C2410_CS5 + BAST_PA_DM9000 + 3,
.flags = IORESOURCE_MEM,
},
[1] = {
.start = S3C2410_CS5 + BAST_PA_DM9000 + 0x40,
.end = S3C2410_CS5 + BAST_PA_DM9000 + 0x40 + 0x3f,
.flags = IORESOURCE_MEM,
},
[2] = {
.start = IRQ_DM9000,
.end = IRQ_DM9000,
.flags = IORESOURCE_IRQ | IORESOURCE_IRQ_HIGHLEVEL,
}
};
static struct platform_device bast_device_dm9k = {
.name = "dm9000",
.id = 0,
.num_resources = ARRAY_SIZE(bast_dm9k_resource),
.resource = bast_dm9k_resource,
};
Note the setting of the IRQ trigger flag in bast_dm9k_resource[2].flags,
as this will generate a warning if it is not present. The trigger from
the flags field will be passed to request_irq() when registering the IRQ
handler to ensure that the IRQ is setup correctly.
This shows a typical platform device, without the optional configuration
platform data supplied. The next example uses the same resources, but adds
the optional platform data to pass extra configuration data:
static struct dm9000_plat_data bast_dm9k_platdata = {
.flags = DM9000_PLATF_16BITONLY,
};
static struct platform_device bast_device_dm9k = {
.name = "dm9000",
.id = 0,
.num_resources = ARRAY_SIZE(bast_dm9k_resource),
.resource = bast_dm9k_resource,
.dev = {
.platform_data = &bast_dm9k_platdata,
}
};
The platform data is defined in include/linux/dm9000.h and described below.
Platform data
-------------
Extra platform data for the DM9000 can describe the IO bus width to the
device, whether or not an external PHY is attached to the device and
the availability of an external configuration EEPROM.
The flags for the platform data .flags field are as follows:
DM9000_PLATF_8BITONLY
The IO should be done with 8bit operations.
DM9000_PLATF_16BITONLY
The IO should be done with 16bit operations.
DM9000_PLATF_32BITONLY
The IO should be done with 32bit operations.
DM9000_PLATF_EXT_PHY
The chip is connected to an external PHY.
DM9000_PLATF_NO_EEPROM
This can be used to signify that the board does not have an
EEPROM, or that the EEPROM should be hidden from the user.
DM9000_PLATF_SIMPLE_PHY
Switch to using the simpler PHY polling method which does not
try and read the MII PHY state regularly. This is only available
when using the internal PHY. See the section on link state polling
for more information.
The config symbol DM9000_FORCE_SIMPLE_PHY_POLL, Kconfig entry
"Force simple NSR based PHY polling" allows this flag to be
forced on at build time.
PHY Link state polling
----------------------
The driver keeps track of the link state and informs the network core
about link (carrier) availability. This is managed by several methods
depending on the version of the chip and on which PHY is being used.
For the internal PHY, the original (and currently default) method is
to read the MII state, either when the status changes if we have the
necessary interrupt support in the chip or every two seconds via a
periodic timer.
To reduce the overhead for the internal PHY, there is now the option
of using the DM9000_FORCE_SIMPLE_PHY_POLL config, or DM9000_PLATF_SIMPLE_PHY
platform data option to read the summary information without the
expensive MII accesses. This method is faster, but does not print
as much information.
When using an external PHY, the driver currently has to poll the MII
link status as there is no method for getting an interrupt on link change.
DM9000A / DM9000B
-----------------
These chips are functionally similar to the DM9000E and are supported easily
by the same driver. The features are:
1) Interrupt on internal PHY state change. This means that the periodic
polling of the PHY status may be disabled on these devices when using
the internal PHY.
2) TCP/UDP checksum offloading, which the driver does not currently support.
ethtool
-------
The driver supports the ethtool interface for access to the driver
state information, the PHY state and the EEPROM.

View file

@ -0,0 +1,66 @@
Note: This driver doesn't have a maintainer.
Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver for Linux.
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
This driver provides kernel support for Davicom DM9102(A)/DM9132/DM9801 ethernet cards ( CNET
10/100 ethernet cards uses Davicom chipset too, so this driver supports CNET cards too ).If you
didn't compile this driver as a module, it will automatically load itself on boot and print a
line similar to :
dmfe: Davicom DM9xxx net driver, version 1.36.4 (2002-01-17)
If you compiled this driver as a module, you have to load it on boot.You can load it with command :
insmod dmfe
This way it will autodetect the device mode.This is the suggested way to load the module.Or you can pass
a mode= setting to module while loading, like :
insmod dmfe mode=0 # Force 10M Half Duplex
insmod dmfe mode=1 # Force 100M Half Duplex
insmod dmfe mode=4 # Force 10M Full Duplex
insmod dmfe mode=5 # Force 100M Full Duplex
Next you should configure your network interface with a command similar to :
ifconfig eth0 172.22.3.18
^^^^^^^^^^^
Your IP Address
Then you may have to modify the default routing table with command :
route add default eth0
Now your ethernet card should be up and running.
TODO:
Implement pci_driver::suspend() and pci_driver::resume() power management methods.
Check on 64 bit boxes.
Check and fix on big endian boxes.
Test and make sure PCI latency is now correct for all cases.
Authors:
Sten Wang <sten_wang@davicom.com.tw > : Original Author
Contributors:
Marcelo Tosatti <marcelo@conectiva.com.br>
Alan Cox <alan@lxorguk.ukuu.org.uk>
Jeff Garzik <jgarzik@pobox.com>
Vojtech Pavlik <vojtech@suse.cz>

View file

@ -0,0 +1,157 @@
===================
DNS Resolver Module
===================
Contents:
- Overview.
- Compilation.
- Setting up.
- Usage.
- Mechanism.
- Debugging.
========
OVERVIEW
========
The DNS resolver module provides a way for kernel services to make DNS queries
by way of requesting a key of key type dns_resolver. These queries are
upcalled to userspace through /sbin/request-key.
These routines must be supported by userspace tools dns.upcall, cifs.upcall and
request-key. It is under development and does not yet provide the full feature
set. The features it does support include:
(*) Implements the dns_resolver key_type to contact userspace.
It does not yet support the following AFS features:
(*) Dns query support for AFSDB resource record.
This code is extracted from the CIFS filesystem.
===========
COMPILATION
===========
The module should be enabled by turning on the kernel configuration options:
CONFIG_DNS_RESOLVER - tristate "DNS Resolver support"
==========
SETTING UP
==========
To set up this facility, the /etc/request-key.conf file must be altered so that
/sbin/request-key can appropriately direct the upcalls. For example, to handle
basic dname to IPv4/IPv6 address resolution, the following line should be
added:
#OP TYPE DESC CO-INFO PROGRAM ARG1 ARG2 ARG3 ...
#====== ============ ======= ======= ==========================
create dns_resolver * * /usr/sbin/cifs.upcall %k
To direct a query for query type 'foo', a line of the following should be added
before the more general line given above as the first match is the one taken.
create dns_resolver foo:* * /usr/sbin/dns.foo %k
=====
USAGE
=====
To make use of this facility, one of the following functions that are
implemented in the module can be called after doing:
#include <linux/dns_resolver.h>
(1) int dns_query(const char *type, const char *name, size_t namelen,
const char *options, char **_result, time_t *_expiry);
This is the basic access function. It looks for a cached DNS query and if
it doesn't find it, it upcalls to userspace to make a new DNS query, which
may then be cached. The key description is constructed as a string of the
form:
[<type>:]<name>
where <type> optionally specifies the particular upcall program to invoke,
and thus the type of query to do, and <name> specifies the string to be
looked up. The default query type is a straight hostname to IP address
set lookup.
The name parameter is not required to be a NUL-terminated string, and its
length should be given by the namelen argument.
The options parameter may be NULL or it may be a set of options
appropriate to the query type.
The return value is a string appropriate to the query type. For instance,
for the default query type it is just a list of comma-separated IPv4 and
IPv6 addresses. The caller must free the result.
The length of the result string is returned on success, and a negative
error code is returned otherwise. -EKEYREJECTED will be returned if the
DNS lookup failed.
If _expiry is non-NULL, the expiry time (TTL) of the result will be
returned also.
The kernel maintains an internal keyring in which it caches looked up keys.
This can be cleared by any process that has the CAP_SYS_ADMIN capability by
the use of KEYCTL_KEYRING_CLEAR on the keyring ID.
===============================
READING DNS KEYS FROM USERSPACE
===============================
Keys of dns_resolver type can be read from userspace using keyctl_read() or
"keyctl read/print/pipe".
=========
MECHANISM
=========
The dnsresolver module registers a key type called "dns_resolver". Keys of
this type are used to transport and cache DNS lookup results from userspace.
When dns_query() is invoked, it calls request_key() to search the local
keyrings for a cached DNS result. If that fails to find one, it upcalls to
userspace to get a new result.
Upcalls to userspace are made through the request_key() upcall vector, and are
directed by means of configuration lines in /etc/request-key.conf that tell
/sbin/request-key what program to run to instantiate the key.
The upcall handler program is responsible for querying the DNS, processing the
result into a form suitable for passing to the keyctl_instantiate_key()
routine. This then passes the data to dns_resolver_instantiate() which strips
off and processes any options included in the data, and then attaches the
remainder of the string to the key as its payload.
The upcall handler program should set the expiry time on the key to that of the
lowest TTL of all the records it has extracted a result from. This means that
the key will be discarded and recreated when the data it holds has expired.
dns_query() returns a copy of the value attached to the key, or an error if
that is indicated instead.
See <file:Documentation/security/keys-request-key.txt> for further
information about request-key function.
=========
DEBUGGING
=========
Debugging messages can be turned on dynamically by writing a 1 into the
following file:
/sys/module/dnsresolver/parameters/debug

View file

@ -0,0 +1,93 @@
Document about softnet driver issues
Transmit path guidelines:
1) The ndo_start_xmit method must not return NETDEV_TX_BUSY under
any normal circumstances. It is considered a hard error unless
there is no way your device can tell ahead of time when it's
transmit function will become busy.
Instead it must maintain the queue properly. For example,
for a driver implementing scatter-gather this means:
static netdev_tx_t drv_hard_start_xmit(struct sk_buff *skb,
struct net_device *dev)
{
struct drv *dp = netdev_priv(dev);
lock_tx(dp);
...
/* This is a hard error log it. */
if (TX_BUFFS_AVAIL(dp) <= (skb_shinfo(skb)->nr_frags + 1)) {
netif_stop_queue(dev);
unlock_tx(dp);
printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n",
dev->name);
return NETDEV_TX_BUSY;
}
... queue packet to card ...
... update tx consumer index ...
if (TX_BUFFS_AVAIL(dp) <= (MAX_SKB_FRAGS + 1))
netif_stop_queue(dev);
...
unlock_tx(dp);
...
return NETDEV_TX_OK;
}
And then at the end of your TX reclamation event handling:
if (netif_queue_stopped(dp->dev) &&
TX_BUFFS_AVAIL(dp) > (MAX_SKB_FRAGS + 1))
netif_wake_queue(dp->dev);
For a non-scatter-gather supporting card, the three tests simply become:
/* This is a hard error log it. */
if (TX_BUFFS_AVAIL(dp) <= 0)
and:
if (TX_BUFFS_AVAIL(dp) == 0)
and:
if (netif_queue_stopped(dp->dev) &&
TX_BUFFS_AVAIL(dp) > 0)
netif_wake_queue(dp->dev);
2) An ndo_start_xmit method must not modify the shared parts of a
cloned SKB.
3) Do not forget that once you return NETDEV_TX_OK from your
ndo_start_xmit method, it is your driver's responsibility to free
up the SKB and in some finite amount of time.
For example, this means that it is not allowed for your TX
mitigation scheme to let TX packets "hang out" in the TX
ring unreclaimed forever if no new TX packets are sent.
This error can deadlock sockets waiting for send buffer room
to be freed up.
If you return NETDEV_TX_BUSY from the ndo_start_xmit method, you
must not keep any reference to that SKB and you must not attempt
to free it up.
Probing guidelines:
1) Any hardware layer address you obtain for your device should
be verified. For example, for ethernet check it with
linux/etherdevice.h:is_valid_ether_addr()
Close/stop guidelines:
1) After the ndo_stop routine has been called, the hardware must
not receive or transmit any data. All in flight packets must
be aborted. If necessary, poll or wait for completion of
any reset commands.
2) The ndo_stop routine will be called by unregister_netdevice
if device is still UP.

View file

@ -0,0 +1,197 @@
Linux* Base Driver for the Intel(R) PRO/100 Family of Adapters
==============================================================
March 15, 2011
Contents
========
- In This Release
- Identifying Your Adapter
- Building and Installation
- Driver Configuration Parameters
- Additional Configurations
- Known Issues
- Support
In This Release
===============
This file describes the Linux* Base Driver for the Intel(R) PRO/100 Family of
Adapters. This driver includes support for Itanium(R)2-based systems.
For questions related to hardware requirements, refer to the documentation
supplied with your Intel PRO/100 adapter.
The following features are now available in supported kernels:
- Native VLANs
- Channel Bonding (teaming)
- SNMP
Channel Bonding documentation can be found in the Linux kernel source:
/Documentation/networking/bonding.txt
Identifying Your Adapter
========================
For more information on how to identify your adapter, go to the Adapter &
Driver ID Guide at:
http://support.intel.com/support/network/adapter/pro100/21397.htm
For the latest Intel network drivers for Linux, refer to the following
website. In the search field, enter your adapter name or type, or use the
networking link on the left to search for your adapter:
http://downloadfinder.intel.com/scripts-df/support_intel.asp
Driver Configuration Parameters
===============================
The default value for each parameter is generally the recommended setting,
unless otherwise noted.
Rx Descriptors: Number of receive descriptors. A receive descriptor is a data
structure that describes a receive buffer and its attributes to the network
controller. The data in the descriptor is used by the controller to write
data from the controller to host memory. In the 3.x.x driver the valid range
for this parameter is 64-256. The default value is 64. This parameter can be
changed using the command:
ethtool -G eth? rx n, where n is the number of desired rx descriptors.
Tx Descriptors: Number of transmit descriptors. A transmit descriptor is a data
structure that describes a transmit buffer and its attributes to the network
controller. The data in the descriptor is used by the controller to read
data from the host memory to the controller. In the 3.x.x driver the valid
range for this parameter is 64-256. The default value is 64. This parameter
can be changed using the command:
ethtool -G eth? tx n, where n is the number of desired tx descriptors.
Speed/Duplex: The driver auto-negotiates the link speed and duplex settings by
default. The ethtool utility can be used as follows to force speed/duplex.
ethtool -s eth? autoneg off speed {10|100} duplex {full|half}
NOTE: setting the speed/duplex to incorrect values will cause the link to
fail.
Event Log Message Level: The driver uses the message level flag to log events
to syslog. The message level can be set at driver load time. It can also be
set using the command:
ethtool -s eth? msglvl n
Additional Configurations
=========================
Configuring the Driver on Different Distributions
-------------------------------------------------
Configuring a network driver to load properly when the system is started is
distribution dependent. Typically, the configuration process involves adding
an alias line to /etc/modprobe.d/*.conf as well as editing other system
startup scripts and/or configuration files. Many popular Linux
distributions ship with tools to make these changes for you. To learn the
proper way to configure a network device for your system, refer to your
distribution documentation. If during this process you are asked for the
driver or module name, the name for the Linux Base Driver for the Intel
PRO/100 Family of Adapters is e100.
As an example, if you install the e100 driver for two PRO/100 adapters
(eth0 and eth1), add the following to a configuration file in /etc/modprobe.d/
alias eth0 e100
alias eth1 e100
Viewing Link Messages
---------------------
In order to see link messages and other Intel driver information on your
console, you must set the dmesg level up to six. This can be done by
entering the following on the command line before loading the e100 driver:
dmesg -n 8
If you wish to see all messages issued by the driver, including debug
messages, set the dmesg level to eight.
NOTE: This setting is not saved across reboots.
ethtool
-------
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. The ethtool
version 1.6 or later is required for this functionality.
The latest release of ethtool can be found from
http://ftp.kernel.org/pub/software/network/ethtool/
Enabling Wake on LAN* (WoL)
---------------------------
WoL is provided through the ethtool* utility. For instructions on enabling
WoL with ethtool, refer to the ethtool man page.
WoL will be enabled on the system during the next shut down or reboot. For
this driver version, in order to enable WoL, the e100 driver must be
loaded when shutting down or rebooting the system.
NAPI
----
NAPI (Rx polling mode) is supported in the e100 driver.
See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI.
Multiple Interfaces on Same Ethernet Broadcast Network
------------------------------------------------------
Due to the default ARP behavior on Linux, it is not possible to have
one system on two IP networks in the same Ethernet broadcast domain
(non-partitioned switch) behave as expected. All Ethernet interfaces
will respond to IP traffic for any IP address assigned to the system.
This results in unbalanced receive traffic.
If you have multiple interfaces in a server, either turn on ARP
filtering by
(1) entering: echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
(this only works if your kernel's version is higher than 2.4.5), or
(2) installing the interfaces in separate broadcast domains (either
in different switches or in a switch partitioned to VLANs).
Support
=======
For general information, go to the Intel support website at:
http://support.intel.com
or the Intel Wired Networking project hosted by Sourceforge at:
http://sourceforge.net/projects/e1000
If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related to the
issue to e1000-devel@lists.sourceforge.net.
License
=======
This software program is released under the terms of a license agreement
between you ('Licensee') and Intel. Do not use or load this software or any
associated materials (collectively, the 'Software') until you have carefully
read the full terms and conditions of the file COPYING located in this software
package. By loading or using the Software, you agree to the terms of this
Agreement. If you do not agree with the terms of this Agreement, do not install
or use the Software.
* Other names and brands may be claimed as the property of others.

View file

@ -0,0 +1,461 @@
Linux* Base Driver for Intel(R) Ethernet Network Connection
===========================================================
Intel Gigabit Linux driver.
Copyright(c) 1999 - 2013 Intel Corporation.
Contents
========
- Identifying Your Adapter
- Command Line Parameters
- Speed and Duplex Configuration
- Additional Configurations
- Support
Identifying Your Adapter
========================
For more information on how to identify your adapter, go to the Adapter &
Driver ID Guide at:
http://support.intel.com/support/go/network/adapter/idguide.htm
For the latest Intel network drivers for Linux, refer to the following
website. In the search field, enter your adapter name or type, or use the
networking link on the left to search for your adapter:
http://support.intel.com/support/go/network/adapter/home.htm
Command Line Parameters
=======================
The default value for each parameter is generally the recommended setting,
unless otherwise noted.
NOTES: For more information about the AutoNeg, Duplex, and Speed
parameters, see the "Speed and Duplex Configuration" section in
this document.
For more information about the InterruptThrottleRate,
RxIntDelay, TxIntDelay, RxAbsIntDelay, and TxAbsIntDelay
parameters, see the application note at:
http://www.intel.com/design/network/applnots/ap450.htm
AutoNeg
-------
(Supported only on adapters with copper connections)
Valid Range: 0x01-0x0F, 0x20-0x2F
Default Value: 0x2F
This parameter is a bit-mask that specifies the speed and duplex settings
advertised by the adapter. When this parameter is used, the Speed and
Duplex parameters must not be specified.
NOTE: Refer to the Speed and Duplex section of this readme for more
information on the AutoNeg parameter.
Duplex
------
(Supported only on adapters with copper connections)
Valid Range: 0-2 (0=auto-negotiate, 1=half, 2=full)
Default Value: 0
This defines the direction in which data is allowed to flow. Can be
either one or two-directional. If both Duplex and the link partner are
set to auto-negotiate, the board auto-detects the correct duplex. If the
link partner is forced (either full or half), Duplex defaults to half-
duplex.
FlowControl
-----------
Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx)
Default Value: Reads flow control settings from the EEPROM
This parameter controls the automatic generation(Tx) and response(Rx)
to Ethernet PAUSE frames.
InterruptThrottleRate
---------------------
(not supported on Intel(R) 82542, 82543 or 82544-based adapters)
Valid Range: 0,1,3,4,100-100000 (0=off, 1=dynamic, 3=dynamic conservative,
4=simplified balancing)
Default Value: 3
The driver can limit the amount of interrupts per second that the adapter
will generate for incoming packets. It does this by writing a value to the
adapter that is based on the maximum amount of interrupts that the adapter
will generate per second.
Setting InterruptThrottleRate to a value greater or equal to 100
will program the adapter to send out a maximum of that many interrupts
per second, even if more packets have come in. This reduces interrupt
load on the system and can lower CPU utilization under heavy load,
but will increase latency as packets are not processed as quickly.
The default behaviour of the driver previously assumed a static
InterruptThrottleRate value of 8000, providing a good fallback value for
all traffic types,but lacking in small packet performance and latency.
The hardware can handle many more small packets per second however, and
for this reason an adaptive interrupt moderation algorithm was implemented.
Since 7.3.x, the driver has two adaptive modes (setting 1 or 3) in which
it dynamically adjusts the InterruptThrottleRate value based on the traffic
that it receives. After determining the type of incoming traffic in the last
timeframe, it will adjust the InterruptThrottleRate to an appropriate value
for that traffic.
The algorithm classifies the incoming traffic every interval into
classes. Once the class is determined, the InterruptThrottleRate value is
adjusted to suit that traffic type the best. There are three classes defined:
"Bulk traffic", for large amounts of packets of normal size; "Low latency",
for small amounts of traffic and/or a significant percentage of small
packets; and "Lowest latency", for almost completely small packets or
minimal traffic.
In dynamic conservative mode, the InterruptThrottleRate value is set to 4000
for traffic that falls in class "Bulk traffic". If traffic falls in the "Low
latency" or "Lowest latency" class, the InterruptThrottleRate is increased
stepwise to 20000. This default mode is suitable for most applications.
For situations where low latency is vital such as cluster or
grid computing, the algorithm can reduce latency even more when
InterruptThrottleRate is set to mode 1. In this mode, which operates
the same as mode 3, the InterruptThrottleRate will be increased stepwise to
70000 for traffic in class "Lowest latency".
In simplified mode the interrupt rate is based on the ratio of TX and
RX traffic. If the bytes per second rate is approximately equal, the
interrupt rate will drop as low as 2000 interrupts per second. If the
traffic is mostly transmit or mostly receive, the interrupt rate could
be as high as 8000.
Setting InterruptThrottleRate to 0 turns off any interrupt moderation
and may improve small packet latency, but is generally not suitable
for bulk throughput traffic.
NOTE: InterruptThrottleRate takes precedence over the TxAbsIntDelay and
RxAbsIntDelay parameters. In other words, minimizing the receive
and/or transmit absolute delays does not force the controller to
generate more interrupts than what the Interrupt Throttle Rate
allows.
CAUTION: If you are using the Intel(R) PRO/1000 CT Network Connection
(controller 82547), setting InterruptThrottleRate to a value
greater than 75,000, may hang (stop transmitting) adapters
under certain network conditions. If this occurs a NETDEV
WATCHDOG message is logged in the system event log. In
addition, the controller is automatically reset, restoring
the network connection. To eliminate the potential for the
hang, ensure that InterruptThrottleRate is set no greater
than 75,000 and is not set to 0.
NOTE: When e1000 is loaded with default settings and multiple adapters
are in use simultaneously, the CPU utilization may increase non-
linearly. In order to limit the CPU utilization without impacting
the overall throughput, we recommend that you load the driver as
follows:
modprobe e1000 InterruptThrottleRate=3000,3000,3000
This sets the InterruptThrottleRate to 3000 interrupts/sec for
the first, second, and third instances of the driver. The range
of 2000 to 3000 interrupts per second works on a majority of
systems and is a good starting point, but the optimal value will
be platform-specific. If CPU utilization is not a concern, use
RX_POLLING (NAPI) and default driver settings.
RxDescriptors
-------------
Valid Range: 80-256 for 82542 and 82543-based adapters
80-4096 for all other supported adapters
Default Value: 256
This value specifies the number of receive buffer descriptors allocated
by the driver. Increasing this value allows the driver to buffer more
incoming packets, at the expense of increased system memory utilization.
Each descriptor is 16 bytes. A receive buffer is also allocated for each
descriptor and can be either 2048, 4096, 8192, or 16384 bytes, depending
on the MTU setting. The maximum MTU size is 16110.
NOTE: MTU designates the frame size. It only needs to be set for Jumbo
Frames. Depending on the available system resources, the request
for a higher number of receive descriptors may be denied. In this
case, use a lower number.
RxIntDelay
----------
Valid Range: 0-65535 (0=off)
Default Value: 0
This value delays the generation of receive interrupts in units of 1.024
microseconds. Receive interrupt reduction can improve CPU efficiency if
properly tuned for specific network traffic. Increasing this value adds
extra latency to frame reception and can end up decreasing the throughput
of TCP traffic. If the system is reporting dropped receives, this value
may be set too high, causing the driver to run out of available receive
descriptors.
CAUTION: When setting RxIntDelay to a value other than 0, adapters may
hang (stop transmitting) under certain network conditions. If
this occurs a NETDEV WATCHDOG message is logged in the system
event log. In addition, the controller is automatically reset,
restoring the network connection. To eliminate the potential
for the hang ensure that RxIntDelay is set to 0.
RxAbsIntDelay
-------------
(This parameter is supported only on 82540, 82545 and later adapters.)
Valid Range: 0-65535 (0=off)
Default Value: 128
This value, in units of 1.024 microseconds, limits the delay in which a
receive interrupt is generated. Useful only if RxIntDelay is non-zero,
this value ensures that an interrupt is generated after the initial
packet is received within the set amount of time. Proper tuning,
along with RxIntDelay, may improve traffic throughput in specific network
conditions.
Speed
-----
(This parameter is supported only on adapters with copper connections.)
Valid Settings: 0, 10, 100, 1000
Default Value: 0 (auto-negotiate at all supported speeds)
Speed forces the line speed to the specified value in megabits per second
(Mbps). If this parameter is not specified or is set to 0 and the link
partner is set to auto-negotiate, the board will auto-detect the correct
speed. Duplex should also be set when Speed is set to either 10 or 100.
TxDescriptors
-------------
Valid Range: 80-256 for 82542 and 82543-based adapters
80-4096 for all other supported adapters
Default Value: 256
This value is the number of transmit descriptors allocated by the driver.
Increasing this value allows the driver to queue more transmits. Each
descriptor is 16 bytes.
NOTE: Depending on the available system resources, the request for a
higher number of transmit descriptors may be denied. In this case,
use a lower number.
TxDescriptorStep
----------------
Valid Range: 1 (use every Tx Descriptor)
4 (use every 4th Tx Descriptor)
Default Value: 1 (use every Tx Descriptor)
On certain non-Intel architectures, it has been observed that intense TX
traffic bursts of short packets may result in an improper descriptor
writeback. If this occurs, the driver will report a "TX Timeout" and reset
the adapter, after which the transmit flow will restart, though data may
have stalled for as much as 10 seconds before it resumes.
The improper writeback does not occur on the first descriptor in a system
memory cache-line, which is typically 32 bytes, or 4 descriptors long.
Setting TxDescriptorStep to a value of 4 will ensure that all TX descriptors
are aligned to the start of a system memory cache line, and so this problem
will not occur.
NOTES: Setting TxDescriptorStep to 4 effectively reduces the number of
TxDescriptors available for transmits to 1/4 of the normal allocation.
This has a possible negative performance impact, which may be
compensated for by allocating more descriptors using the TxDescriptors
module parameter.
There are other conditions which may result in "TX Timeout", which will
not be resolved by the use of the TxDescriptorStep parameter. As the
issue addressed by this parameter has never been observed on Intel
Architecture platforms, it should not be used on Intel platforms.
TxIntDelay
----------
Valid Range: 0-65535 (0=off)
Default Value: 64
This value delays the generation of transmit interrupts in units of
1.024 microseconds. Transmit interrupt reduction can improve CPU
efficiency if properly tuned for specific network traffic. If the
system is reporting dropped transmits, this value may be set too high
causing the driver to run out of available transmit descriptors.
TxAbsIntDelay
-------------
(This parameter is supported only on 82540, 82545 and later adapters.)
Valid Range: 0-65535 (0=off)
Default Value: 64
This value, in units of 1.024 microseconds, limits the delay in which a
transmit interrupt is generated. Useful only if TxIntDelay is non-zero,
this value ensures that an interrupt is generated after the initial
packet is sent on the wire within the set amount of time. Proper tuning,
along with TxIntDelay, may improve traffic throughput in specific
network conditions.
XsumRX
------
(This parameter is NOT supported on the 82542-based adapter.)
Valid Range: 0-1
Default Value: 1
A value of '1' indicates that the driver should enable IP checksum
offload for received packets (both UDP and TCP) to the adapter hardware.
Copybreak
---------
Valid Range: 0-xxxxxxx (0=off)
Default Value: 256
Usage: insmod e1000.ko copybreak=128
Driver copies all packets below or equaling this size to a fresh RX
buffer before handing it up the stack.
This parameter is different than other parameters, in that it is a
single (not 1,1,1 etc.) parameter applied to all driver instances and
it is also available during runtime at
/sys/module/e1000/parameters/copybreak
SmartPowerDownEnable
--------------------
Valid Range: 0-1
Default Value: 0 (disabled)
Allows PHY to turn off in lower power states. The user can turn off
this parameter in supported chipsets.
KumeranLockLoss
---------------
Valid Range: 0-1
Default Value: 1 (enabled)
This workaround skips resetting the PHY at shutdown for the initial
silicon releases of ICH8 systems.
Speed and Duplex Configuration
==============================
Three keywords are used to control the speed and duplex configuration.
These keywords are Speed, Duplex, and AutoNeg.
If the board uses a fiber interface, these keywords are ignored, and the
fiber interface board only links at 1000 Mbps full-duplex.
For copper-based boards, the keywords interact as follows:
The default operation is auto-negotiate. The board advertises all
supported speed and duplex combinations, and it links at the highest
common speed and duplex mode IF the link partner is set to auto-negotiate.
If Speed = 1000, limited auto-negotiation is enabled and only 1000 Mbps
is advertised (The 1000BaseT spec requires auto-negotiation.)
If Speed = 10 or 100, then both Speed and Duplex should be set. Auto-
negotiation is disabled, and the AutoNeg parameter is ignored. Partner
SHOULD also be forced.
The AutoNeg parameter is used when more control is required over the
auto-negotiation process. It should be used when you wish to control which
speed and duplex combinations are advertised during the auto-negotiation
process.
The parameter may be specified as either a decimal or hexadecimal value as
determined by the bitmap below.
Bit position 7 6 5 4 3 2 1 0
Decimal Value 128 64 32 16 8 4 2 1
Hex value 80 40 20 10 8 4 2 1
Speed (Mbps) N/A N/A 1000 N/A 100 100 10 10
Duplex Full Full Half Full Half
Some examples of using AutoNeg:
modprobe e1000 AutoNeg=0x01 (Restricts autonegotiation to 10 Half)
modprobe e1000 AutoNeg=1 (Same as above)
modprobe e1000 AutoNeg=0x02 (Restricts autonegotiation to 10 Full)
modprobe e1000 AutoNeg=0x03 (Restricts autonegotiation to 10 Half or 10 Full)
modprobe e1000 AutoNeg=0x04 (Restricts autonegotiation to 100 Half)
modprobe e1000 AutoNeg=0x05 (Restricts autonegotiation to 10 Half or 100
Half)
modprobe e1000 AutoNeg=0x020 (Restricts autonegotiation to 1000 Full)
modprobe e1000 AutoNeg=32 (Same as above)
Note that when this parameter is used, Speed and Duplex must not be specified.
If the link partner is forced to a specific speed and duplex, then this
parameter should not be used. Instead, use the Speed and Duplex parameters
previously mentioned to force the adapter to the same speed and duplex.
Additional Configurations
=========================
Jumbo Frames
------------
Jumbo Frames support is enabled by changing the MTU to a value larger than
the default of 1500. Use the ifconfig command to increase the MTU size.
For example:
ifconfig eth<x> mtu 9000 up
This setting is not saved across reboots. It can be made permanent if
you add:
MTU=9000
to the file /etc/sysconfig/network-scripts/ifcfg-eth<x>. This example
applies to the Red Hat distributions; other distributions may store this
setting in a different location.
Notes:
Degradation in throughput performance may be observed in some Jumbo frames
environments. If this is observed, increasing the application's socket buffer
size and/or increasing the /proc/sys/net/ipv4/tcp_*mem entry values may help.
See the specific application manual and /usr/src/linux*/Documentation/
networking/ip-sysctl.txt for more details.
- The maximum MTU setting for Jumbo Frames is 16110. This value coincides
with the maximum Jumbo Frames size of 16128.
- Using Jumbo frames at 10 or 100 Mbps is not supported and may result in
poor performance or loss of link.
- Adapters based on the Intel(R) 82542 and 82573V/E controller do not
support Jumbo Frames. These correspond to the following product names:
Intel(R) PRO/1000 Gigabit Server Adapter
Intel(R) PRO/1000 PM Network Connection
ethtool
-------
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. The ethtool
version 1.6 or later is required for this functionality.
The latest release of ethtool can be found from
http://ftp.kernel.org/pub/software/network/ethtool/
Enabling Wake on LAN* (WoL)
---------------------------
WoL is configured through the ethtool* utility.
WoL will be enabled on the system during the next shut down or reboot.
For this driver version, in order to enable WoL, the e1000 driver must be
loaded when shutting down or rebooting the system.
Support
=======
For general information, go to the Intel support website at:
http://support.intel.com
or the Intel Wired Networking project hosted by Sourceforge at:
http://sourceforge.net/projects/e1000
If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related
to the issue to e1000-devel@lists.sf.net

View file

@ -0,0 +1,312 @@
Linux* Driver for Intel(R) Ethernet Network Connection
======================================================
Intel Gigabit Linux driver.
Copyright(c) 1999 - 2013 Intel Corporation.
Contents
========
- Identifying Your Adapter
- Command Line Parameters
- Additional Configurations
- Support
Identifying Your Adapter
========================
The e1000e driver supports all PCI Express Intel(R) Gigabit Network
Connections, except those that are 82575, 82576 and 82580-based*.
* NOTE: The Intel(R) PRO/1000 P Dual Port Server Adapter is supported by
the e1000 driver, not the e1000e driver due to the 82546 part being used
behind a PCI Express bridge.
For more information on how to identify your adapter, go to the Adapter &
Driver ID Guide at:
http://support.intel.com/support/go/network/adapter/idguide.htm
For the latest Intel network drivers for Linux, refer to the following
website. In the search field, enter your adapter name or type, or use the
networking link on the left to search for your adapter:
http://support.intel.com/support/go/network/adapter/home.htm
Command Line Parameters
=======================
The default value for each parameter is generally the recommended setting,
unless otherwise noted.
NOTES: For more information about the InterruptThrottleRate,
RxIntDelay, TxIntDelay, RxAbsIntDelay, and TxAbsIntDelay
parameters, see the application note at:
http://www.intel.com/design/network/applnots/ap450.htm
InterruptThrottleRate
---------------------
Valid Range: 0,1,3,4,100-100000 (0=off, 1=dynamic, 3=dynamic conservative,
4=simplified balancing)
Default Value: 3
The driver can limit the amount of interrupts per second that the adapter
will generate for incoming packets. It does this by writing a value to the
adapter that is based on the maximum amount of interrupts that the adapter
will generate per second.
Setting InterruptThrottleRate to a value greater or equal to 100
will program the adapter to send out a maximum of that many interrupts
per second, even if more packets have come in. This reduces interrupt
load on the system and can lower CPU utilization under heavy load,
but will increase latency as packets are not processed as quickly.
The default behaviour of the driver previously assumed a static
InterruptThrottleRate value of 8000, providing a good fallback value for
all traffic types, but lacking in small packet performance and latency.
The hardware can handle many more small packets per second however, and
for this reason an adaptive interrupt moderation algorithm was implemented.
The driver has two adaptive modes (setting 1 or 3) in which
it dynamically adjusts the InterruptThrottleRate value based on the traffic
that it receives. After determining the type of incoming traffic in the last
timeframe, it will adjust the InterruptThrottleRate to an appropriate value
for that traffic.
The algorithm classifies the incoming traffic every interval into
classes. Once the class is determined, the InterruptThrottleRate value is
adjusted to suit that traffic type the best. There are three classes defined:
"Bulk traffic", for large amounts of packets of normal size; "Low latency",
for small amounts of traffic and/or a significant percentage of small
packets; and "Lowest latency", for almost completely small packets or
minimal traffic.
In dynamic conservative mode, the InterruptThrottleRate value is set to 4000
for traffic that falls in class "Bulk traffic". If traffic falls in the "Low
latency" or "Lowest latency" class, the InterruptThrottleRate is increased
stepwise to 20000. This default mode is suitable for most applications.
For situations where low latency is vital such as cluster or
grid computing, the algorithm can reduce latency even more when
InterruptThrottleRate is set to mode 1. In this mode, which operates
the same as mode 3, the InterruptThrottleRate will be increased stepwise to
70000 for traffic in class "Lowest latency".
In simplified mode the interrupt rate is based on the ratio of TX and
RX traffic. If the bytes per second rate is approximately equal, the
interrupt rate will drop as low as 2000 interrupts per second. If the
traffic is mostly transmit or mostly receive, the interrupt rate could
be as high as 8000.
Setting InterruptThrottleRate to 0 turns off any interrupt moderation
and may improve small packet latency, but is generally not suitable
for bulk throughput traffic.
NOTE: InterruptThrottleRate takes precedence over the TxAbsIntDelay and
RxAbsIntDelay parameters. In other words, minimizing the receive
and/or transmit absolute delays does not force the controller to
generate more interrupts than what the Interrupt Throttle Rate
allows.
NOTE: When e1000e is loaded with default settings and multiple adapters
are in use simultaneously, the CPU utilization may increase non-
linearly. In order to limit the CPU utilization without impacting
the overall throughput, we recommend that you load the driver as
follows:
modprobe e1000e InterruptThrottleRate=3000,3000,3000
This sets the InterruptThrottleRate to 3000 interrupts/sec for
the first, second, and third instances of the driver. The range
of 2000 to 3000 interrupts per second works on a majority of
systems and is a good starting point, but the optimal value will
be platform-specific. If CPU utilization is not a concern, use
RX_POLLING (NAPI) and default driver settings.
RxIntDelay
----------
Valid Range: 0-65535 (0=off)
Default Value: 0
This value delays the generation of receive interrupts in units of 1.024
microseconds. Receive interrupt reduction can improve CPU efficiency if
properly tuned for specific network traffic. Increasing this value adds
extra latency to frame reception and can end up decreasing the throughput
of TCP traffic. If the system is reporting dropped receives, this value
may be set too high, causing the driver to run out of available receive
descriptors.
CAUTION: When setting RxIntDelay to a value other than 0, adapters may
hang (stop transmitting) under certain network conditions. If
this occurs a NETDEV WATCHDOG message is logged in the system
event log. In addition, the controller is automatically reset,
restoring the network connection. To eliminate the potential
for the hang ensure that RxIntDelay is set to 0.
RxAbsIntDelay
-------------
Valid Range: 0-65535 (0=off)
Default Value: 8
This value, in units of 1.024 microseconds, limits the delay in which a
receive interrupt is generated. Useful only if RxIntDelay is non-zero,
this value ensures that an interrupt is generated after the initial
packet is received within the set amount of time. Proper tuning,
along with RxIntDelay, may improve traffic throughput in specific network
conditions.
TxIntDelay
----------
Valid Range: 0-65535 (0=off)
Default Value: 8
This value delays the generation of transmit interrupts in units of
1.024 microseconds. Transmit interrupt reduction can improve CPU
efficiency if properly tuned for specific network traffic. If the
system is reporting dropped transmits, this value may be set too high
causing the driver to run out of available transmit descriptors.
TxAbsIntDelay
-------------
Valid Range: 0-65535 (0=off)
Default Value: 32
This value, in units of 1.024 microseconds, limits the delay in which a
transmit interrupt is generated. Useful only if TxIntDelay is non-zero,
this value ensures that an interrupt is generated after the initial
packet is sent on the wire within the set amount of time. Proper tuning,
along with TxIntDelay, may improve traffic throughput in specific
network conditions.
Copybreak
---------
Valid Range: 0-xxxxxxx (0=off)
Default Value: 256
Driver copies all packets below or equaling this size to a fresh RX
buffer before handing it up the stack.
This parameter is different than other parameters, in that it is a
single (not 1,1,1 etc.) parameter applied to all driver instances and
it is also available during runtime at
/sys/module/e1000e/parameters/copybreak
SmartPowerDownEnable
--------------------
Valid Range: 0-1
Default Value: 0 (disabled)
Allows PHY to turn off in lower power states. The user can set this parameter
in supported chipsets.
KumeranLockLoss
---------------
Valid Range: 0-1
Default Value: 1 (enabled)
This workaround skips resetting the PHY at shutdown for the initial
silicon releases of ICH8 systems.
IntMode
-------
Valid Range: 0-2 (0=legacy, 1=MSI, 2=MSI-X)
Default Value: 2
Allows changing the interrupt mode at module load time, without requiring a
recompile. If the driver load fails to enable a specific interrupt mode, the
driver will try other interrupt modes, from least to most compatible. The
interrupt order is MSI-X, MSI, Legacy. If specifying MSI (IntMode=1)
interrupts, only MSI and Legacy will be attempted.
CrcStripping
------------
Valid Range: 0-1
Default Value: 1 (enabled)
Strip the CRC from received packets before sending up the network stack. If
you have a machine with a BMC enabled but cannot receive IPMI traffic after
loading or enabling the driver, try disabling this feature.
WriteProtectNVM
---------------
Valid Range: 0,1
Default Value: 1
If set to 1, configure the hardware to ignore all write/erase cycles to the
GbE region in the ICHx NVM (in order to prevent accidental corruption of the
NVM). This feature can be disabled by setting the parameter to 0 during initial
driver load.
NOTE: The machine must be power cycled (full off/on) when enabling NVM writes
via setting the parameter to zero. Once the NVM has been locked (via the
parameter at 1 when the driver loads) it cannot be unlocked except via power
cycle.
Additional Configurations
=========================
Jumbo Frames
------------
Jumbo Frames support is enabled by changing the MTU to a value larger than
the default of 1500. Use the ifconfig command to increase the MTU size.
For example:
ifconfig eth<x> mtu 9000 up
This setting is not saved across reboots.
Notes:
- The maximum MTU setting for Jumbo Frames is 9216. This value coincides
with the maximum Jumbo Frames size of 9234 bytes.
- Using Jumbo frames at 10 or 100 Mbps is not supported and may result in
poor performance or loss of link.
- Some adapters limit Jumbo Frames sized packets to a maximum of
4096 bytes and some adapters do not support Jumbo Frames.
- Jumbo Frames cannot be configured on an 82579-based Network device, if
MACSec is enabled on the system.
ethtool
-------
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. We
strongly recommend downloading the latest version of ethtool at:
http://ftp.kernel.org/pub/software/network/ethtool/
NOTE: When validating enable/disable tests on some parts (82578, for example)
you need to add a few seconds between tests when working with ethtool.
Speed and Duplex
----------------
Speed and Duplex are configured through the ethtool* utility. For
instructions, refer to the ethtool man page.
Enabling Wake on LAN* (WoL)
---------------------------
WoL is configured through the ethtool* utility. For instructions on
enabling WoL with ethtool, refer to the ethtool man page.
WoL will be enabled on the system during the next shut down or reboot.
For this driver version, in order to enable WoL, the e1000e driver must be
loaded when shutting down or rebooting the system.
In most cases Wake On LAN is only supported on port A for multiple port
adapters. To verify if a port supports Wake on Lan run ethtool eth<X>.
Support
=======
For general information, go to the Intel support website at:
www.intel.com/support/
or the Intel Wired Networking project hosted by Sourceforge at:
http://sourceforge.net/projects/e1000
If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related
to the issue to e1000-devel@lists.sf.net

View file

@ -0,0 +1,528 @@
EQL Driver: Serial IP Load Balancing HOWTO
Simon "Guru Aleph-Null" Janes, simon@ncm.com
v1.1, February 27, 1995
This is the manual for the EQL device driver. EQL is a software device
that lets you load-balance IP serial links (SLIP or uncompressed PPP)
to increase your bandwidth. It will not reduce your latency (i.e. ping
times) except in the case where you already have lots of traffic on
your link, in which it will help them out. This driver has been tested
with the 1.1.75 kernel, and is known to have patched cleanly with
1.1.86. Some testing with 1.1.92 has been done with the v1.1 patch
which was only created to patch cleanly in the very latest kernel
source trees. (Yes, it worked fine.)
1. Introduction
Which is worse? A huge fee for a 56K leased line or two phone lines?
It's probably the former. If you find yourself craving more bandwidth,
and have a ISP that is flexible, it is now possible to bind modems
together to work as one point-to-point link to increase your
bandwidth. All without having to have a special black box on either
side.
The eql driver has only been tested with the Livingston PortMaster-2e
terminal server. I do not know if other terminal servers support load-
balancing, but I do know that the PortMaster does it, and does it
almost as well as the eql driver seems to do it (-- Unfortunately, in
my testing so far, the Livingston PortMaster 2e's load-balancing is a
good 1 to 2 KB/s slower than the test machine working with a 28.8 Kbps
and 14.4 Kbps connection. However, I am not sure that it really is
the PortMaster, or if it's Linux's TCP drivers. I'm told that Linux's
TCP implementation is pretty fast though.--)
I suggest to ISPs out there that it would probably be fair to charge
a load-balancing client 75% of the cost of the second line and 50% of
the cost of the third line etc...
Hey, we can all dream you know...
2. Kernel Configuration
Here I describe the general steps of getting a kernel up and working
with the eql driver. From patching, building, to installing.
2.1. Patching The Kernel
If you do not have or cannot get a copy of the kernel with the eql
driver folded into it, get your copy of the driver from
ftp://slaughter.ncm.com/pub/Linux/LOAD_BALANCING/eql-1.1.tar.gz.
Unpack this archive someplace obvious like /usr/local/src/. It will
create the following files:
______________________________________________________________________
-rw-r--r-- guru/ncm 198 Jan 19 18:53 1995 eql-1.1/NO-WARRANTY
-rw-r--r-- guru/ncm 30620 Feb 27 21:40 1995 eql-1.1/eql-1.1.patch
-rwxr-xr-x guru/ncm 16111 Jan 12 22:29 1995 eql-1.1/eql_enslave
-rw-r--r-- guru/ncm 2195 Jan 10 21:48 1995 eql-1.1/eql_enslave.c
______________________________________________________________________
Unpack a recent kernel (something after 1.1.92) someplace convenient
like say /usr/src/linux-1.1.92.eql. Use symbolic links to point
/usr/src/linux to this development directory.
Apply the patch by running the commands:
______________________________________________________________________
cd /usr/src
patch </usr/local/src/eql-1.1/eql-1.1.patch
______________________________________________________________________
2.2. Building The Kernel
After patching the kernel, run make config and configure the kernel
for your hardware.
After configuration, make and install according to your habit.
3. Network Configuration
So far, I have only used the eql device with the DSLIP SLIP connection
manager by Matt Dillon (-- "The man who sold his soul to code so much
so quickly."--) . How you configure it for other "connection"
managers is up to you. Most other connection managers that I've seen
don't do a very good job when it comes to handling more than one
connection.
3.1. /etc/rc.d/rc.inet1
In rc.inet1, ifconfig the eql device to the IP address you usually use
for your machine, and the MTU you prefer for your SLIP lines. One
could argue that MTU should be roughly half the usual size for two
modems, one-third for three, one-fourth for four, etc... But going
too far below 296 is probably overkill. Here is an example ifconfig
command that sets up the eql device:
______________________________________________________________________
ifconfig eql 198.67.33.239 mtu 1006
______________________________________________________________________
Once the eql device is up and running, add a static default route to
it in the routing table using the cool new route syntax that makes
life so much easier:
______________________________________________________________________
route add default eql
______________________________________________________________________
3.2. Enslaving Devices By Hand
Enslaving devices by hand requires two utility programs: eql_enslave
and eql_emancipate (-- eql_emancipate hasn't been written because when
an enslaved device "dies", it is automatically taken out of the queue.
I haven't found a good reason to write it yet... other than for
completeness, but that isn't a good motivator is it?--)
The syntax for enslaving a device is "eql_enslave <master-name>
<slave-name> <estimated-bps>". Here are some example enslavings:
______________________________________________________________________
eql_enslave eql sl0 28800
eql_enslave eql ppp0 14400
eql_enslave eql sl1 57600
______________________________________________________________________
When you want to free a device from its life of slavery, you can
either down the device with ifconfig (eql will automatically bury the
dead slave and remove it from its queue) or use eql_emancipate to free
it. (-- Or just ifconfig it down, and the eql driver will take it out
for you.--)
______________________________________________________________________
eql_emancipate eql sl0
eql_emancipate eql ppp0
eql_emancipate eql sl1
______________________________________________________________________
3.3. DSLIP Configuration for the eql Device
The general idea is to bring up and keep up as many SLIP connections
as you need, automatically.
3.3.1. /etc/slip/runslip.conf
Here is an example runslip.conf:
______________________________________________________________________
name sl-line-1
enabled
baud 38400
mtu 576
ducmd -e /etc/slip/dialout/cua2-288.xp -t 9
command eql_enslave eql $interface 28800
address 198.67.33.239
line /dev/cua2
name sl-line-2
enabled
baud 38400
mtu 576
ducmd -e /etc/slip/dialout/cua3-288.xp -t 9
command eql_enslave eql $interface 28800
address 198.67.33.239
line /dev/cua3
______________________________________________________________________
3.4. Using PPP and the eql Device
I have not yet done any load-balancing testing for PPP devices, mainly
because I don't have a PPP-connection manager like SLIP has with
DSLIP. I did find a good tip from LinuxNET:Billy for PPP performance:
make sure you have asyncmap set to something so that control
characters are not escaped.
I tried to fix up a PPP script/system for redialing lost PPP
connections for use with the eql driver the weekend of Feb 25-26 '95
(Hereafter known as the 8-hour PPP Hate Festival). Perhaps later this
year.
4. About the Slave Scheduler Algorithm
The slave scheduler probably could be replaced with a dozen other
things and push traffic much faster. The formula in the current set
up of the driver was tuned to handle slaves with wildly different
bits-per-second "priorities".
All testing I have done was with two 28.8 V.FC modems, one connecting
at 28800 bps or slower, and the other connecting at 14400 bps all the
time.
One version of the scheduler was able to push 5.3 K/s through the
28800 and 14400 connections, but when the priorities on the links were
very wide apart (57600 vs. 14400) the "faster" modem received all
traffic and the "slower" modem starved.
5. Testers' Reports
Some people have experimented with the eql device with newer
kernels (than 1.1.75). I have since updated the driver to patch
cleanly in newer kernels because of the removal of the old "slave-
balancing" driver config option.
o icee from LinuxNET patched 1.1.86 without any rejects and was able
to boot the kernel and enslave a couple of ISDN PPP links.
5.1. Randolph Bentson's Test Report
From bentson@grieg.seaslug.org Wed Feb 8 19:08:09 1995
Date: Tue, 7 Feb 95 22:57 PST
From: Randolph Bentson <bentson@grieg.seaslug.org>
To: guru@ncm.com
Subject: EQL driver tests
I have been checking out your eql driver. (Nice work, that!)
Although you may already done this performance testing, here
are some data I've discovered.
Randolph Bentson
bentson@grieg.seaslug.org
---------------------------------------------------------
A pseudo-device driver, EQL, written by Simon Janes, can be used
to bundle multiple SLIP connections into what appears to be a
single connection. This allows one to improve dial-up network
connectivity gradually, without having to buy expensive DSU/CSU
hardware and services.
I have done some testing of this software, with two goals in
mind: first, to ensure it actually works as described and
second, as a method of exercising my device driver.
The following performance measurements were derived from a set
of SLIP connections run between two Linux systems (1.1.84) using
a 486DX2/66 with a Cyclom-8Ys and a 486SLC/40 with a Cyclom-16Y.
(Ports 0,1,2,3 were used. A later configuration will distribute
port selection across the different Cirrus chips on the boards.)
Once a link was established, I timed a binary ftp transfer of
289284 bytes of data. If there were no overhead (packet headers,
inter-character and inter-packet delays, etc.) the transfers
would take the following times:
bits/sec seconds
345600 8.3
234600 12.3
172800 16.7
153600 18.8
76800 37.6
57600 50.2
38400 75.3
28800 100.4
19200 150.6
9600 301.3
A single line running at the lower speeds and with large packets
comes to within 2% of this. Performance is limited for the higher
speeds (as predicted by the Cirrus databook) to an aggregate of
about 160 kbits/sec. The next round of testing will distribute
the load across two or more Cirrus chips.
The good news is that one gets nearly the full advantage of the
second, third, and fourth line's bandwidth. (The bad news is
that the connection establishment seemed fragile for the higher
speeds. Once established, the connection seemed robust enough.)
#lines speed mtu seconds theory actual %of
kbit/sec duration speed speed max
3 115200 900 _ 345600
3 115200 400 18.1 345600 159825 46
2 115200 900 _ 230400
2 115200 600 18.1 230400 159825 69
2 115200 400 19.3 230400 149888 65
4 57600 900 _ 234600
4 57600 600 _ 234600
4 57600 400 _ 234600
3 57600 600 20.9 172800 138413 80
3 57600 900 21.2 172800 136455 78
3 115200 600 21.7 345600 133311 38
3 57600 400 22.5 172800 128571 74
4 38400 900 25.2 153600 114795 74
4 38400 600 26.4 153600 109577 71
4 38400 400 27.3 153600 105965 68
2 57600 900 29.1 115200 99410.3 86
1 115200 900 30.7 115200 94229.3 81
2 57600 600 30.2 115200 95789.4 83
3 38400 900 30.3 115200 95473.3 82
3 38400 600 31.2 115200 92719.2 80
1 115200 600 31.3 115200 92423 80
2 57600 400 32.3 115200 89561.6 77
1 115200 400 32.8 115200 88196.3 76
3 38400 400 33.5 115200 86353.4 74
2 38400 900 43.7 76800 66197.7 86
2 38400 600 44 76800 65746.4 85
2 38400 400 47.2 76800 61289 79
4 19200 900 50.8 76800 56945.7 74
4 19200 400 53.2 76800 54376.7 70
4 19200 600 53.7 76800 53870.4 70
1 57600 900 54.6 57600 52982.4 91
1 57600 600 56.2 57600 51474 89
3 19200 900 60.5 57600 47815.5 83
1 57600 400 60.2 57600 48053.8 83
3 19200 600 62 57600 46658.7 81
3 19200 400 64.7 57600 44711.6 77
1 38400 900 79.4 38400 36433.8 94
1 38400 600 82.4 38400 35107.3 91
2 19200 900 84.4 38400 34275.4 89
1 38400 400 86.8 38400 33327.6 86
2 19200 600 87.6 38400 33023.3 85
2 19200 400 91.2 38400 31719.7 82
4 9600 900 94.7 38400 30547.4 79
4 9600 400 106 38400 27290.9 71
4 9600 600 110 38400 26298.5 68
3 9600 900 118 28800 24515.6 85
3 9600 600 120 28800 24107 83
3 9600 400 131 28800 22082.7 76
1 19200 900 155 19200 18663.5 97
1 19200 600 161 19200 17968 93
1 19200 400 170 19200 17016.7 88
2 9600 600 176 19200 16436.6 85
2 9600 900 180 19200 16071.3 83
2 9600 400 181 19200 15982.5 83
1 9600 900 305 9600 9484.72 98
1 9600 600 314 9600 9212.87 95
1 9600 400 332 9600 8713.37 90
5.2. Anthony Healy's Report
Date: Mon, 13 Feb 1995 16:17:29 +1100 (EST)
From: Antony Healey <ahealey@st.nepean.uws.edu.au>
To: Simon Janes <guru@ncm.com>
Subject: Re: Load Balancing
Hi Simon,
I've installed your patch and it works great. I have trialed
it over twin SL/IP lines, just over null modems, but I was
able to data at over 48Kb/s [ISDN link -Simon]. I managed a
transfer of up to 7.5 Kbyte/s on one go, but averaged around
6.4 Kbyte/s, which I think is pretty cool. :)

View file

@ -0,0 +1,145 @@
LC-trie implementation notes.
Node types
----------
leaf
An end node with data. This has a copy of the relevant key, along
with 'hlist' with routing table entries sorted by prefix length.
See struct leaf and struct leaf_info.
trie node or tnode
An internal node, holding an array of child (leaf or tnode) pointers,
indexed through a subset of the key. See Level Compression.
A few concepts explained
------------------------
Bits (tnode)
The number of bits in the key segment used for indexing into the
child array - the "child index". See Level Compression.
Pos (tnode)
The position (in the key) of the key segment used for indexing into
the child array. See Path Compression.
Path Compression / skipped bits
Any given tnode is linked to from the child array of its parent, using
a segment of the key specified by the parent's "pos" and "bits"
In certain cases, this tnode's own "pos" will not be immediately
adjacent to the parent (pos+bits), but there will be some bits
in the key skipped over because they represent a single path with no
deviations. These "skipped bits" constitute Path Compression.
Note that the search algorithm will simply skip over these bits when
searching, making it necessary to save the keys in the leaves to
verify that they actually do match the key we are searching for.
Level Compression / child arrays
the trie is kept level balanced moving, under certain conditions, the
children of a full child (see "full_children") up one level, so that
instead of a pure binary tree, each internal node ("tnode") may
contain an arbitrarily large array of links to several children.
Conversely, a tnode with a mostly empty child array (see empty_children)
may be "halved", having some of its children moved downwards one level,
in order to avoid ever-increasing child arrays.
empty_children
the number of positions in the child array of a given tnode that are
NULL.
full_children
the number of children of a given tnode that aren't path compressed.
(in other words, they aren't NULL or leaves and their "pos" is equal
to this tnode's "pos"+"bits").
(The word "full" here is used more in the sense of "complete" than
as the opposite of "empty", which might be a tad confusing.)
Comments
---------
We have tried to keep the structure of the code as close to fib_hash as
possible to allow verification and help up reviewing.
fib_find_node()
A good start for understanding this code. This function implements a
straightforward trie lookup.
fib_insert_node()
Inserts a new leaf node in the trie. This is bit more complicated than
fib_find_node(). Inserting a new node means we might have to run the
level compression algorithm on part of the trie.
trie_leaf_remove()
Looks up a key, deletes it and runs the level compression algorithm.
trie_rebalance()
The key function for the dynamic trie after any change in the trie
it is run to optimize and reorganize. Tt will walk the trie upwards
towards the root from a given tnode, doing a resize() at each step
to implement level compression.
resize()
Analyzes a tnode and optimizes the child array size by either inflating
or shrinking it repeatedly until it fulfills the criteria for optimal
level compression. This part follows the original paper pretty closely
and there may be some room for experimentation here.
inflate()
Doubles the size of the child array within a tnode. Used by resize().
halve()
Halves the size of the child array within a tnode - the inverse of
inflate(). Used by resize();
fn_trie_insert(), fn_trie_delete(), fn_trie_select_default()
The route manipulation functions. Should conform pretty closely to the
corresponding functions in fib_hash.
fn_trie_flush()
This walks the full trie (using nextleaf()) and searches for empty
leaves which have to be removed.
fn_trie_dump()
Dumps the routing table ordered by prefix length. This is somewhat
slower than the corresponding fib_hash function, as we have to walk the
entire trie for each prefix length. In comparison, fib_hash is organized
as one "zone"/hash per prefix length.
Locking
-------
fib_lock is used for an RW-lock in the same way that this is done in fib_hash.
However, the functions are somewhat separated for other possible locking
scenarios. It might conceivably be possible to run trie_rebalance via RCU
to avoid read_lock in the fn_trie_lookup() function.
Main lookup mechanism
---------------------
fn_trie_lookup() is the main lookup function.
The lookup is in its simplest form just like fib_find_node(). We descend the
trie, key segment by key segment, until we find a leaf. check_leaf() does
the fib_semantic_match in the leaf's sorted prefix hlist.
If we find a match, we are done.
If we don't find a match, we enter prefix matching mode. The prefix length,
starting out at the same as the key length, is reduced one step at a time,
and we backtrack upwards through the trie trying to find a longest matching
prefix. The goal is always to reach a leaf and get a positive result from the
fib_semantic_match mechanism.
Inside each tnode, the search for longest matching prefix consists of searching
through the child array, chopping off (zeroing) the least significant "1" of
the child index until we find a match or the child index consists of nothing but
zeros.
At this point we backtrack (t->stats.backtrack++) up the trie, continuing to
chop off part of the key in order to find the longest matching prefix.
At this point we will repeatedly descend subtries to look for a match, and there
are some optimizations available that can provide us with "shortcuts" to avoid
descending into dead ends. Look for "HL_OPTIMIZE" sections in the code.
To alleviate any doubts about the correctness of the route selection process,
a new netlink operation has been added. Look for NETLINK_FIB_LOOKUP, which
gives userland access to fib_lookup().

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,64 @@
FORE Systems PCA-200E/SBA-200E ATM NIC driver
---------------------------------------------
This driver adds support for the FORE Systems 200E-series ATM adapters
to the Linux operating system. It is based on the earlier PCA-200E driver
written by Uwe Dannowski.
The driver simultaneously supports PCA-200E and SBA-200E adapters on
i386, alpha (untested), powerpc, sparc and sparc64 archs.
The intent is to enable the use of different models of FORE adapters at the
same time, by hosts that have several bus interfaces (such as PCI+SBUS,
or PCI+EISA).
Only PCI and SBUS devices are currently supported by the driver, but support
for other bus interfaces such as EISA should not be too hard to add.
Firmware Copyright Notice
-------------------------
Please read the fore200e_firmware_copyright file present
in the linux/drivers/atm directory for details and restrictions.
Firmware Updates
----------------
The FORE Systems 200E-series driver is shipped with firmware data being
uploaded to the ATM adapters at system boot time or at module loading time.
The supplied firmware images should work with all adapters.
However, if you encounter problems (the firmware doesn't start or the driver
is unable to read the PROM data), you may consider trying another firmware
version. Alternative binary firmware images can be found somewhere on the
ForeThought CD-ROM supplied with your adapter by FORE Systems.
You can also get the latest firmware images from FORE Systems at
http://en.wikipedia.org/wiki/FORE_Systems. Register TACTics Online and go to
the 'software updates' pages. The firmware binaries are part of
the various ForeThought software distributions.
Notice that different versions of the PCA-200E firmware exist, depending
on the endianness of the host architecture. The driver is shipped with
both little and big endian PCA firmware images.
Name and location of the new firmware images can be set at kernel
configuration time:
1. Copy the new firmware binary files (with .bin, .bin1 or .bin2 suffix)
to some directory, such as linux/drivers/atm.
2. Reconfigure your kernel to set the new firmware name and location.
Expected pathnames are absolute or relative to the drivers/atm directory.
3. Rebuild and re-install your kernel or your module.
Feedback
--------
Feedback is welcome. Please send success stories/bug reports/
patches/improvement/comments/flames to <lizzi@cnam.fr>.

View file

@ -0,0 +1,39 @@
Frame Relay (FR) support for linux is built into a two tiered system of device
drivers. The upper layer implements RFC1490 FR specification, and uses the
Data Link Connection Identifier (DLCI) as its hardware address. Usually these
are assigned by your network supplier, they give you the number/numbers of
the Virtual Connections (VC) assigned to you.
Each DLCI is a point-to-point link between your machine and a remote one.
As such, a separate device is needed to accommodate the routing. Within the
net-tools archives is 'dlcicfg'. This program will communicate with the
base "DLCI" device, and create new net devices named 'dlci00', 'dlci01'...
The configuration script will ask you how many DLCIs you need, as well as
how many DLCIs you want to assign to each Frame Relay Access Device (FRAD).
The DLCI uses a number of function calls to communicate with the FRAD, all
of which are stored in the FRAD's private data area. assoc/deassoc,
activate/deactivate and dlci_config. The DLCI supplies a receive function
to the FRAD to accept incoming packets.
With this initial offering, only 1 FRAD driver is available. With many thanks
to Sangoma Technologies, David Mandelstam & Gene Kozin, the S502A, S502E &
S508 are supported. This driver is currently set up for only FR, but as
Sangoma makes more firmware modules available, it can be updated to provide
them as well.
Configuration of the FRAD makes use of another net-tools program, 'fradcfg'.
This program makes use of a configuration file (which dlcicfg can also read)
to specify the types of boards to be configured as FRADs, as well as perform
any board specific configuration. The Sangoma module of fradcfg loads the
FR firmware into the card, sets the irq/port/memory information, and provides
an initial configuration.
Additional FRAD device drivers can be added as hardware is available.
At this time, the dlcicfg and fradcfg programs have not been incorporated into
the net-tools distribution. They can be found at ftp.invlogic.com, in
/pub/linux. Note that with OS/2 FTPD, you end up in /pub by default, so just
use 'cd linux'. v0.10 is for use on pre-2.0.3 and earlier, v0.15 is for
pre-2.0.4 and later.

View file

@ -0,0 +1,117 @@
Generic networking statistics for netlink users
======================================================================
Statistic counters are grouped into structs:
Struct TLV type Description
----------------------------------------------------------------------
gnet_stats_basic TCA_STATS_BASIC Basic statistics
gnet_stats_rate_est TCA_STATS_RATE_EST Rate estimator
gnet_stats_queue TCA_STATS_QUEUE Queue statistics
none TCA_STATS_APP Application specific
Collecting:
-----------
Declare the statistic structs you need:
struct mystruct {
struct gnet_stats_basic bstats;
struct gnet_stats_queue qstats;
...
};
Update statistics:
mystruct->tstats.packet++;
mystruct->qstats.backlog += skb->pkt_len;
Export to userspace (Dump):
---------------------------
my_dumping_routine(struct sk_buff *skb, ...)
{
struct gnet_dump dump;
if (gnet_stats_start_copy(skb, TCA_STATS2, &mystruct->lock, &dump) < 0)
goto rtattr_failure;
if (gnet_stats_copy_basic(&dump, &mystruct->bstats) < 0 ||
gnet_stats_copy_queue(&dump, &mystruct->qstats) < 0 ||
gnet_stats_copy_app(&dump, &xstats, sizeof(xstats)) < 0)
goto rtattr_failure;
if (gnet_stats_finish_copy(&dump) < 0)
goto rtattr_failure;
...
}
TCA_STATS/TCA_XSTATS backward compatibility:
--------------------------------------------
Prior users of struct tc_stats and xstats can maintain backward
compatibility by calling the compat wrappers to keep providing the
existing TLV types.
my_dumping_routine(struct sk_buff *skb, ...)
{
if (gnet_stats_start_copy_compat(skb, TCA_STATS2, TCA_STATS,
TCA_XSTATS, &mystruct->lock, &dump) < 0)
goto rtattr_failure;
...
}
A struct tc_stats will be filled out during gnet_stats_copy_* calls
and appended to the skb. TCA_XSTATS is provided if gnet_stats_copy_app
was called.
Locking:
--------
Locks are taken before writing and released once all statistics have
been written. Locks are always released in case of an error. You
are responsible for making sure that the lock is initialized.
Rate Estimator:
--------------
0) Prepare an estimator attribute. Most likely this would be in user
space. The value of this TLV should contain a tc_estimator structure.
As usual, such a TLV needs to be 32 bit aligned and therefore the
length needs to be appropriately set, etc. The estimator interval
and ewma log need to be converted to the appropriate values.
tc_estimator.c::tc_setup_estimator() is advisable to be used as the
conversion routine. It does a few clever things. It takes a time
interval in microsecs, a time constant also in microsecs and a struct
tc_estimator to be populated. The returned tc_estimator can be
transported to the kernel. Transfer such a structure in a TLV of type
TCA_RATE to your code in the kernel.
In the kernel when setting up:
1) make sure you have basic stats and rate stats setup first.
2) make sure you have initialized stats lock that is used to setup such
stats.
3) Now initialize a new estimator:
int ret = gen_new_estimator(my_basicstats,my_rate_est_stats,
mystats_lock, attr_with_tcestimator_struct);
if ret == 0
success
else
failed
From now on, every time you dump my_rate_est_stats it will contain
up-to-date info.
Once you are done, call gen_kill_estimator(my_basicstats,
my_rate_est_stats) Make sure that my_basicstats and my_rate_est_stats
are still valid (i.e still exist) at the time of making this call.
Authors:
--------
Thomas Graf <tgraf@suug.ch>
Jamal Hadi Salim <hadi@cyberus.ca>

View file

@ -0,0 +1,132 @@
Generic HDLC layer
Krzysztof Halasa <khc@pm.waw.pl>
Generic HDLC layer currently supports:
1. Frame Relay (ANSI, CCITT, Cisco and no LMI)
- Normal (routed) and Ethernet-bridged (Ethernet device emulation)
interfaces can share a single PVC.
- ARP support (no InARP support in the kernel - there is an
experimental InARP user-space daemon available on:
http://www.kernel.org/pub/linux/utils/net/hdlc/).
2. raw HDLC - either IP (IPv4) interface or Ethernet device emulation
3. Cisco HDLC
4. PPP
5. X.25 (uses X.25 routines).
Generic HDLC is a protocol driver only - it needs a low-level driver
for your particular hardware.
Ethernet device emulation (using HDLC or Frame-Relay PVC) is compatible
with IEEE 802.1Q (VLANs) and 802.1D (Ethernet bridging).
Make sure the hdlc.o and the hardware driver are loaded. It should
create a number of "hdlc" (hdlc0 etc) network devices, one for each
WAN port. You'll need the "sethdlc" utility, get it from:
http://www.kernel.org/pub/linux/utils/net/hdlc/
Compile sethdlc.c utility:
gcc -O2 -Wall -o sethdlc sethdlc.c
Make sure you're using a correct version of sethdlc for your kernel.
Use sethdlc to set physical interface, clock rate, HDLC mode used,
and add any required PVCs if using Frame Relay.
Usually you want something like:
sethdlc hdlc0 clock int rate 128000
sethdlc hdlc0 cisco interval 10 timeout 25
or
sethdlc hdlc0 rs232 clock ext
sethdlc hdlc0 fr lmi ansi
sethdlc hdlc0 create 99
ifconfig hdlc0 up
ifconfig pvc0 localIP pointopoint remoteIP
In Frame Relay mode, ifconfig master hdlc device up (without assigning
any IP address to it) before using pvc devices.
Setting interface:
* v35 | rs232 | x21 | t1 | e1 - sets physical interface for a given port
if the card has software-selectable interfaces
loopback - activate hardware loopback (for testing only)
* clock ext - both RX clock and TX clock external
* clock int - both RX clock and TX clock internal
* clock txint - RX clock external, TX clock internal
* clock txfromrx - RX clock external, TX clock derived from RX clock
* rate - sets clock rate in bps (for "int" or "txint" clock only)
Setting protocol:
* hdlc - sets raw HDLC (IP-only) mode
nrz / nrzi / fm-mark / fm-space / manchester - sets transmission code
no-parity / crc16 / crc16-pr0 (CRC16 with preset zeros) / crc32-itu
crc16-itu (CRC16 with ITU-T polynomial) / crc16-itu-pr0 - sets parity
* hdlc-eth - Ethernet device emulation using HDLC. Parity and encoding
as above.
* cisco - sets Cisco HDLC mode (IP, IPv6 and IPX supported)
interval - time in seconds between keepalive packets
timeout - time in seconds after last received keepalive packet before
we assume the link is down
* ppp - sets synchronous PPP mode
* x25 - sets X.25 mode
* fr - Frame Relay mode
lmi ansi / ccitt / cisco / none - LMI (link management) type
dce - Frame Relay DCE (network) side LMI instead of default DTE (user).
It has nothing to do with clocks!
t391 - link integrity verification polling timer (in seconds) - user
t392 - polling verification timer (in seconds) - network
n391 - full status polling counter - user
n392 - error threshold - both user and network
n393 - monitored events count - both user and network
Frame-Relay only:
* create n | delete n - adds / deletes PVC interface with DLCI #n.
Newly created interface will be named pvc0, pvc1 etc.
* create ether n | delete ether n - adds a device for Ethernet-bridged
frames. The device will be named pvceth0, pvceth1 etc.
Board-specific issues
---------------------
n2.o and c101.o need parameters to work:
insmod n2 hw=io,irq,ram,ports[:io,irq,...]
example:
insmod n2 hw=0x300,10,0xD0000,01
or
insmod c101 hw=irq,ram[:irq,...]
example:
insmod c101 hw=9,0xdc000
If built into the kernel, these drivers need kernel (command line) parameters:
n2.hw=io,irq,ram,ports:...
or
c101.hw=irq,ram:...
If you have a problem with N2, C101 or PLX200SYN card, you can issue the
"private" command to see port's packet descriptor rings (in kernel logs):
sethdlc hdlc0 private
The hardware driver has to be build with #define DEBUG_RINGS.
Attaching this info to bug reports would be helpful. Anyway, let me know
if you have problems using this.
For patches and other info look at:
<http://www.kernel.org/pub/linux/utils/net/hdlc/>.

View file

@ -0,0 +1,3 @@
A wiki document on how to use Generic Netlink can be found here:
* http://www.linuxfoundation.org/collaborate/workgroups/networking/generic_netlink_howto

View file

@ -0,0 +1,42 @@
The Gianfar Ethernet Driver
Author: Andy Fleming <afleming@freescale.com>
Updated: 2005-07-28
CHECKSUM OFFLOADING
The eTSEC controller (first included in parts from late 2005 like
the 8548) has the ability to perform TCP, UDP, and IP checksums
in hardware. The Linux kernel only offloads the TCP and UDP
checksums (and always performs the pseudo header checksums), so
the driver only supports checksumming for TCP/IP and UDP/IP
packets. Use ethtool to enable or disable this feature for RX
and TX.
VLAN
In order to use VLAN, please consult Linux documentation on
configuring VLANs. The gianfar driver supports hardware insertion and
extraction of VLAN headers, but not filtering. Filtering will be
done by the kernel.
MULTICASTING
The gianfar driver supports using the group hash table on the
TSEC (and the extended hash table on the eTSEC) for multicast
filtering. On the eTSEC, the exact-match MAC registers are used
before the hash tables. See Linux documentation on how to join
multicast groups.
PADDING
The gianfar driver supports padding received frames with 2 bytes
to align the IP header to a 16-byte boundary, when supported by
hardware.
ETHTOOL
The gianfar driver supports the use of ethtool for many
configuration options. You must run ethtool only on currently
open interfaces. See ethtool documentation for details.

View file

@ -0,0 +1,118 @@
Linux Base Driver for the Intel(R) Ethernet Controller XL710 Family
===================================================================
Intel i40e Linux driver.
Copyright(c) 2013 Intel Corporation.
Contents
========
- Identifying Your Adapter
- Additional Configurations
- Performance Tuning
- Known Issues
- Support
Identifying Your Adapter
========================
The driver in this release is compatible with the Intel Ethernet
Controller XL710 Family.
For more information on how to identify your adapter, go to the Adapter &
Driver ID Guide at:
http://support.intel.com/support/network/sb/CS-012904.htm
Enabling the driver
===================
The driver is enabled via the standard kernel configuration system,
using the make command:
Make oldconfig/silentoldconfig/menuconfig/etc.
The driver is located in the menu structure at:
-> Device Drivers
-> Network device support (NETDEVICES [=y])
-> Ethernet driver support
-> Intel devices
-> Intel(R) Ethernet Controller XL710 Family
Additional Configurations
=========================
Generic Receive Offload (GRO)
-----------------------------
The driver supports the in-kernel software implementation of GRO. GRO has
shown that by coalescing Rx traffic into larger chunks of data, CPU
utilization can be significantly reduced when under large Rx load. GRO is
an evolution of the previously-used LRO interface. GRO is able to coalesce
other protocols besides TCP. It's also safe to use with configurations that
are problematic for LRO, namely bridging and iSCSI.
Ethtool
-------
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. The latest
ethtool version is required for this functionality.
The latest release of ethtool can be found from
https://www.kernel.org/pub/software/network/ethtool
Data Center Bridging (DCB)
--------------------------
DCB configuration is not currently supported.
FCoE
----
The driver supports Fiber Channel over Ethernet (FCoE) and Data Center
Bridging (DCB) functionality. Configuring DCB and FCoE is outside the scope
of this driver doc. Refer to http://www.open-fcoe.org/ for FCoE project
information and http://www.open-lldp.org/ or email list
e1000-eedc@lists.sourceforge.net for DCB information.
MAC and VLAN anti-spoofing feature
----------------------------------
When a malicious driver attempts to send a spoofed packet, it is dropped by
the hardware and not transmitted. An interrupt is sent to the PF driver
notifying it of the spoof attempt.
When a spoofed packet is detected the PF driver will send the following
message to the system log (displayed by the "dmesg" command):
Spoof event(s) detected on VF (n)
Where n=the VF that attempted to do the spoofing.
Performance Tuning
==================
An excellent article on performance tuning can be found at:
http://www.redhat.com/promo/summit/2008/downloads/pdf/Thursday/Mark_Wagner.pdf
Known Issues
============
Support
=======
For general information, go to the Intel support website at:
http://support.intel.com
or the Intel Wired Networking project hosted by Sourceforge at:
http://e1000.sourceforge.net
If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related
to the issue to e1000-devel@lists.sourceforge.net and copy
netdev@vger.kernel.org.

View file

@ -0,0 +1,47 @@
Linux* Base Driver for Intel(R) Network Connection
==================================================
Intel XL710 X710 Virtual Function Linux driver.
Copyright(c) 2013 Intel Corporation.
Contents
========
- Identifying Your Adapter
- Known Issues/Troubleshooting
- Support
This file describes the i40evf Linux* Base Driver for the Intel(R) XL710
X710 Virtual Function.
The i40evf driver supports XL710 and X710 virtual function devices that
can only be activated on kernels with CONFIG_PCI_IOV enabled.
The guest OS loading the i40evf driver must support MSI-X interrupts.
Identifying Your Adapter
========================
For more information on how to identify your adapter, go to the Adapter &
Driver ID Guide at:
http://support.intel.com/support/go/network/adapter/idguide.htm
Known Issues/Troubleshooting
============================
Support
=======
For general information, go to the Intel support website at:
http://support.intel.com
or the Intel Wired Networking project hosted by Sourceforge at:
http://sourceforge.net/projects/e1000
If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related
to the issue to e1000-devel@lists.sf.net

View file

@ -0,0 +1,151 @@
Linux IEEE 802.15.4 implementation
Introduction
============
The IEEE 802.15.4 working group focuses on standardization of bottom
two layers: Medium Access Control (MAC) and Physical (PHY). And there
are mainly two options available for upper layers:
- ZigBee - proprietary protocol from ZigBee Alliance
- 6LowPAN - IPv6 networking over low rate personal area networks
The Linux-ZigBee project goal is to provide complete implementation
of IEEE 802.15.4 and 6LoWPAN protocols. IEEE 802.15.4 is a stack
of protocols for organizing Low-Rate Wireless Personal Area Networks.
The stack is composed of three main parts:
- IEEE 802.15.4 layer; We have chosen to use plain Berkeley socket API,
the generic Linux networking stack to transfer IEEE 802.15.4 messages
and a special protocol over genetlink for configuration/management
- MAC - provides access to shared channel and reliable data delivery
- PHY - represents device drivers
Socket API
==========
int sd = socket(PF_IEEE802154, SOCK_DGRAM, 0);
.....
The address family, socket addresses etc. are defined in the
include/net/af_ieee802154.h header or in the special header
in our userspace package (see either linux-zigbee sourceforge download page
or git tree at git://linux-zigbee.git.sourceforge.net/gitroot/linux-zigbee).
One can use SOCK_RAW for passing raw data towards device xmit function. YMMV.
Kernel side
=============
Like with WiFi, there are several types of devices implementing IEEE 802.15.4.
1) 'HardMAC'. The MAC layer is implemented in the device itself, the device
exports MLME and data API.
2) 'SoftMAC' or just radio. These types of devices are just radio transceivers
possibly with some kinds of acceleration like automatic CRC computation and
comparation, automagic ACK handling, address matching, etc.
Those types of devices require different approach to be hooked into Linux kernel.
MLME - MAC Level Management
============================
Most of IEEE 802.15.4 MLME interfaces are directly mapped on netlink commands.
See the include/net/nl802154.h header. Our userspace tools package
(see above) provides CLI configuration utility for radio interfaces and simple
coordinator for IEEE 802.15.4 networks as an example users of MLME protocol.
HardMAC
=======
See the header include/net/ieee802154_netdev.h. You have to implement Linux
net_device, with .type = ARPHRD_IEEE802154. Data is exchanged with socket family
code via plain sk_buffs. On skb reception skb->cb must contain additional
info as described in the struct ieee802154_mac_cb. During packet transmission
the skb->cb is used to provide additional data to device's header_ops->create
function. Be aware that this data can be overridden later (when socket code
submits skb to qdisc), so if you need something from that cb later, you should
store info in the skb->data on your own.
To hook the MLME interface you have to populate the ml_priv field of your
net_device with a pointer to struct ieee802154_mlme_ops instance. The fields
assoc_req, assoc_resp, disassoc_req, start_req, and scan_req are optional.
All other fields are required.
We provide an example of simple HardMAC driver at drivers/ieee802154/fakehard.c
SoftMAC
=======
The MAC is the middle layer in the IEEE 802.15.4 Linux stack. This moment it
provides interface for drivers registration and management of slave interfaces.
NOTE: Currently the only monitor device type is supported - it's IEEE 802.15.4
stack interface for network sniffers (e.g. WireShark).
This layer is going to be extended soon.
See header include/net/mac802154.h and several drivers in drivers/ieee802154/.
Device drivers API
==================
The include/net/mac802154.h defines following functions:
- struct ieee802154_dev *ieee802154_alloc_device
(size_t priv_size, struct ieee802154_ops *ops):
allocation of IEEE 802.15.4 compatible device
- void ieee802154_free_device(struct ieee802154_dev *dev):
freeing allocated device
- int ieee802154_register_device(struct ieee802154_dev *dev):
register PHY in the system
- void ieee802154_unregister_device(struct ieee802154_dev *dev):
freeing registered PHY
Moreover IEEE 802.15.4 device operations structure should be filled.
Fake drivers
============
In addition there are two drivers available which simulate real devices with
HardMAC (fakehard) and SoftMAC (fakelb - IEEE 802.15.4 loopback driver)
interfaces. This option provides possibility to test and debug stack without
usage of real hardware.
See sources in drivers/ieee802154 folder for more details.
6LoWPAN Linux implementation
============================
The IEEE 802.15.4 standard specifies an MTU of 128 bytes, yielding about 80
octets of actual MAC payload once security is turned on, on a wireless link
with a link throughput of 250 kbps or less. The 6LoWPAN adaptation format
[RFC4944] was specified to carry IPv6 datagrams over such constrained links,
taking into account limited bandwidth, memory, or energy resources that are
expected in applications such as wireless Sensor Networks. [RFC4944] defines
a Mesh Addressing header to support sub-IP forwarding, a Fragmentation header
to support the IPv6 minimum MTU requirement [RFC2460], and stateless header
compression for IPv6 datagrams (LOWPAN_HC1 and LOWPAN_HC2) to reduce the
relatively large IPv6 and UDP headers down to (in the best case) several bytes.
In Semptember 2011 the standard update was published - [RFC6282].
It deprecates HC1 and HC2 compression and defines IPHC encoding format which is
used in this Linux implementation.
All the code related to 6lowpan you may find in files: net/ieee802154/6lowpan.*
To setup 6lowpan interface you need (busybox release > 1.17.0):
1. Add IEEE802.15.4 interface and initialize PANid;
2. Add 6lowpan interface by command like:
# ip link add link wpan0 name lowpan0 type lowpan
3. Set MAC (if needs):
# ip link set lowpan0 address de:ad:be:ef:ca:fe:ba:be
4. Bring up 'lowpan0' interface

View file

@ -0,0 +1,129 @@
Linux* Base Driver for Intel(R) Ethernet Network Connection
===========================================================
Intel Gigabit Linux driver.
Copyright(c) 1999 - 2013 Intel Corporation.
Contents
========
- Identifying Your Adapter
- Additional Configurations
- Support
Identifying Your Adapter
========================
This driver supports all 82575, 82576 and 82580-based Intel (R) gigabit network
connections.
For specific information on how to identify your adapter, go to the Adapter &
Driver ID Guide at:
http://support.intel.com/support/go/network/adapter/idguide.htm
Command Line Parameters
=======================
The default value for each parameter is generally the recommended setting,
unless otherwise noted.
max_vfs
-------
Valid Range: 0-7
Default Value: 0
This parameter adds support for SR-IOV. It causes the driver to spawn up to
max_vfs worth of virtual function.
Additional Configurations
=========================
Jumbo Frames
------------
Jumbo Frames support is enabled by changing the MTU to a value larger than
the default of 1500. Use the ifconfig command to increase the MTU size.
For example:
ifconfig eth<x> mtu 9000 up
This setting is not saved across reboots.
Notes:
- The maximum MTU setting for Jumbo Frames is 9216. This value coincides
with the maximum Jumbo Frames size of 9234 bytes.
- Using Jumbo frames at 10 or 100 Mbps is not supported and may result in
poor performance or loss of link.
ethtool
-------
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. The latest
version of ethtool can be found at:
http://ftp.kernel.org/pub/software/network/ethtool/
Enabling Wake on LAN* (WoL)
---------------------------
WoL is configured through the ethtool* utility.
For instructions on enabling WoL with ethtool, refer to the ethtool man page.
WoL will be enabled on the system during the next shut down or reboot.
For this driver version, in order to enable WoL, the igb driver must be
loaded when shutting down or rebooting the system.
Wake On LAN is only supported on port A of multi-port adapters.
Wake On LAN is not supported for the Intel(R) Gigabit VT Quad Port Server
Adapter.
Multiqueue
----------
In this mode, a separate MSI-X vector is allocated for each queue and one
for "other" interrupts such as link status change and errors. All
interrupts are throttled via interrupt moderation. Interrupt moderation
must be used to avoid interrupt storms while the driver is processing one
interrupt. The moderation value should be at least as large as the expected
time for the driver to process an interrupt. Multiqueue is off by default.
REQUIREMENTS: MSI-X support is required for Multiqueue. If MSI-X is not
found, the system will fallback to MSI or to Legacy interrupts.
MAC and VLAN anti-spoofing feature
----------------------------------
When a malicious driver attempts to send a spoofed packet, it is dropped by
the hardware and not transmitted. An interrupt is sent to the PF driver
notifying it of the spoof attempt.
When a spoofed packet is detected the PF driver will send the following
message to the system log (displayed by the "dmesg" command):
Spoof event(s) detected on VF(n)
Where n=the VF that attempted to do the spoofing.
Setting MAC Address, VLAN and Rate Limit Using IProute2 Tool
------------------------------------------------------------
You can set a MAC address of a Virtual Function (VF), a default VLAN and the
rate limit using the IProute2 tool. Download the latest version of the
iproute2 tool from Sourceforge if your version does not have all the
features you require.
Support
=======
For general information, go to the Intel support website at:
www.intel.com/support/
or the Intel Wired Networking project hosted by Sourceforge at:
http://sourceforge.net/projects/e1000
If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related
to the issue to e1000-devel@lists.sf.net

View file

@ -0,0 +1,80 @@
Linux* Base Driver for Intel(R) Ethernet Network Connection
===========================================================
Intel Gigabit Linux driver.
Copyright(c) 1999 - 2013 Intel Corporation.
Contents
========
- Identifying Your Adapter
- Additional Configurations
- Support
This file describes the igbvf Linux* Base Driver for Intel Network Connection.
The igbvf driver supports 82576-based virtual function devices that can only
be activated on kernels that support SR-IOV. SR-IOV requires the correct
platform and OS support.
The igbvf driver requires the igb driver, version 2.0 or later. The igbvf
driver supports virtual functions generated by the igb driver with a max_vfs
value of 1 or greater. For more information on the max_vfs parameter refer
to the README included with the igb driver.
The guest OS loading the igbvf driver must support MSI-X interrupts.
This driver is only supported as a loadable module at this time. Intel is
not supplying patches against the kernel source to allow for static linking
of the driver. For questions related to hardware requirements, refer to the
documentation supplied with your Intel Gigabit adapter. All hardware
requirements listed apply to use with Linux.
Instructions on updating ethtool can be found in the section "Additional
Configurations" later in this document.
VLANs: There is a limit of a total of 32 shared VLANs to 1 or more VFs.
Identifying Your Adapter
========================
The igbvf driver supports 82576-based virtual function devices that can only
be activated on kernels that support SR-IOV.
For more information on how to identify your adapter, go to the Adapter &
Driver ID Guide at:
http://support.intel.com/support/go/network/adapter/idguide.htm
For the latest Intel network drivers for Linux, refer to the following
website. In the search field, enter your adapter name or type, or use the
networking link on the left to search for your adapter:
http://downloadcenter.intel.com/scripts-df-external/Support_Intel.aspx
Additional Configurations
=========================
ethtool
-------
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. The ethtool
version 3.0 or later is required for this functionality, although we
strongly recommend downloading the latest version at:
http://ftp.kernel.org/pub/software/network/ethtool/
Support
=======
For general information, go to the Intel support website at:
http://support.intel.com
or the Intel Wired Networking project hosted by Sourceforge at:
http://sourceforge.net/projects/e1000
If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related
to the issue to e1000-devel@lists.sf.net

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,29 @@
IP dynamic address hack-port v0.03
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This stuff allows diald ONESHOT connections to get established by
dynamically changing packet source address (and socket's if local procs).
It is implemented for TCP diald-box connections(1) and IP_MASQuerading(2).
If enabled[*] and forwarding interface has changed:
1) Socket (and packet) source address is rewritten ON RETRANSMISSIONS
while in SYN_SENT state (diald-box processes).
2) Out-bounded MASQueraded source address changes ON OUTPUT (when
internal host does retransmission) until a packet from outside is
received by the tunnel.
This is specially helpful for auto dialup links (diald), where the
``actual'' outgoing address is unknown at the moment the link is
going up. So, the *same* (local AND masqueraded) connections requests that
bring the link up will be able to get established.
[*] At boot, by default no address rewriting is attempted.
To enable:
# echo 1 > /proc/sys/net/ipv4/ip_dynaddr
To enable verbose mode:
# echo 2 > /proc/sys/net/ipv4/ip_dynaddr
To disable (default)
# echo 0 > /proc/sys/net/ipv4/ip_dynaddr
Enjoy!
-- Juanjo <jjciarla@raiz.uncu.edu.ar>

View file

@ -0,0 +1,73 @@
Text file for ipddp.c:
AppleTalk-IP Decapsulation and AppleTalk-IP Encapsulation
This text file is written by Jay Schulist <jschlst@samba.org>
Introduction
------------
AppleTalk-IP (IPDDP) is the method computers connected to AppleTalk
networks can use to communicate via IP. AppleTalk-IP is simply IP datagrams
inside AppleTalk packets.
Through this driver you can either allow your Linux box to communicate
IP over an AppleTalk network or you can provide IP gatewaying functions
for your AppleTalk users.
You can currently encapsulate or decapsulate AppleTalk-IP on LocalTalk,
EtherTalk and PPPTalk. The only limit on the protocol is that of what
kernel AppleTalk layer and drivers are available.
Each mode requires its own user space software.
Compiling AppleTalk-IP Decapsulation/Encapsulation
=================================================
AppleTalk-IP decapsulation needs to be compiled into your kernel. You
will need to turn on AppleTalk-IP driver support. Then you will need to
select ONE of the two options; IP to AppleTalk-IP encapsulation support or
AppleTalk-IP to IP decapsulation support. If you compile the driver
statically you will only be able to use the driver for the function you have
enabled in the kernel. If you compile the driver as a module you can
select what mode you want it to run in via a module loading param.
ipddp_mode=1 for AppleTalk-IP encapsulation and ipddp_mode=2 for
AppleTalk-IP to IP decapsulation.
Basic instructions for user space tools
=======================================
I will briefly describe the operation of the tools, but you will
need to consult the supporting documentation for each set of tools.
Decapsulation - You will need to download a software package called
MacGate. In this distribution there will be a tool called MacRoute
which enables you to add routes to the kernel for your Macs by hand.
Also the tool MacRegGateWay is included to register the
proper IP Gateway and IP addresses for your machine. Included in this
distribution is a patch to netatalk-1.4b2+asun2.0a17.2 (available from
ftp.u.washington.edu/pub/user-supported/asun/) this patch is optional
but it allows automatic adding and deleting of routes for Macs. (Handy
for locations with large Mac installations)
Encapsulation - You will need to download a software daemon called ipddpd.
This software expects there to be an AppleTalk-IP gateway on the network.
You will also need to add the proper routes to route your Linux box's IP
traffic out the ipddp interface.
Common Uses of ipddp.c
----------------------
Of course AppleTalk-IP decapsulation and encapsulation, but specifically
decapsulation is being used most for connecting LocalTalk networks to
IP networks. Although it has been used on EtherTalk networks to allow
Macs that are only able to tunnel IP over EtherTalk.
Encapsulation has been used to allow a Linux box stuck on a LocalTalk
network to use IP. It should work equally well if you are stuck on an
EtherTalk only network.
Further Assistance
-------------------
You can contact me (Jay Schulist <jschlst@samba.org>) with any
questions regarding decapsulation or encapsulation. Bradford W. Johnson
<johns393@maroon.tc.umn.edu> originally wrote the ipddp.c driver for IP
encapsulation in AppleTalk.

View file

@ -0,0 +1,158 @@
READ ME FISRT
ATM (i)Chip IA Linux Driver Source
--------------------------------------------------------------------------------
Read This Before You Begin!
--------------------------------------------------------------------------------
Description
-----------
This is the README file for the Interphase PCI ATM (i)Chip IA Linux driver
source release.
The features and limitations of this driver are as follows:
- A single VPI (VPI value of 0) is supported.
- Supports 4K VCs for the server board (with 512K control memory) and 1K
VCs for the client board (with 128K control memory).
- UBR, ABR and CBR service categories are supported.
- Only AAL5 is supported.
- Supports setting of PCR on the VCs.
- Multiple adapters in a system are supported.
- All variants of Interphase ATM PCI (i)Chip adapter cards are supported,
including x575 (OC3, control memory 128K , 512K and packet memory 128K,
512K and 1M), x525 (UTP25) and x531 (DS3 and E3). See
http://www.iphase.com/
for details.
- Only x86 platforms are supported.
- SMP is supported.
Before You Start
----------------
Installation
------------
1. Installing the adapters in the system
To install the ATM adapters in the system, follow the steps below.
a. Login as root.
b. Shut down the system and power off the system.
c. Install one or more ATM adapters in the system.
d. Connect each adapter to a port on an ATM switch. The green 'Link'
LED on the front panel of the adapter will be on if the adapter is
connected to the switch properly when the system is powered up.
e. Power on and boot the system.
2. [ Removed ]
3. Rebuild kernel with ABR support
[ a. and b. removed ]
c. Reconfigure the kernel, choose the Interphase ia driver through "make
menuconfig" or "make xconfig".
d. Rebuild the kernel, loadable modules and the atm tools.
e. Install the new built kernel and modules and reboot.
4. Load the adapter hardware driver (ia driver) if it is built as a module
a. Login as root.
b. Change directory to /lib/modules/<kernel-version>/atm.
c. Run "insmod suni.o;insmod iphase.o"
The yellow 'status' LED on the front panel of the adapter will blink
while the driver is loaded in the system.
d. To verify that the 'ia' driver is loaded successfully, run the
following command:
cat /proc/atm/devices
If the driver is loaded successfully, the output of the command will
be similar to the following lines:
Itf Type ESI/"MAC"addr AAL(TX,err,RX,err,drop) ...
0 ia xxxxxxxxx 0 ( 0 0 0 0 0 ) 5 ( 0 0 0 0 0 )
You can also check the system log file /var/log/messages for messages
related to the ATM driver.
5. Ia Driver Configuration
5.1 Configuration of adapter buffers
The (i)Chip boards have 3 different packet RAM size variants: 128K, 512K and
1M. The RAM size decides the number of buffers and buffer size. The default
size and number of buffers are set as following:
Total Rx RAM Tx RAM Rx Buf Tx Buf Rx buf Tx buf
RAM size size size size size cnt cnt
-------- ------ ------ ------ ------ ------ ------
128K 64K 64K 10K 10K 6 6
512K 256K 256K 10K 10K 25 25
1M 512K 512K 10K 10K 51 51
These setting should work well in most environments, but can be
changed by typing the following command:
insmod <IA_DIR>/ia.o IA_RX_BUF=<RX_CNT> IA_RX_BUF_SZ=<RX_SIZE> \
IA_TX_BUF=<TX_CNT> IA_TX_BUF_SZ=<TX_SIZE>
Where:
RX_CNT = number of receive buffers in the range (1-128)
RX_SIZE = size of receive buffers in the range (48-64K)
TX_CNT = number of transmit buffers in the range (1-128)
TX_SIZE = size of transmit buffers in the range (48-64K)
1. Transmit and receive buffer size must be a multiple of 4.
2. Care should be taken so that the memory required for the
transmit and receive buffers is less than or equal to the
total adapter packet memory.
5.2 Turn on ia debug trace
When the ia driver is built with the CONFIG_ATM_IA_DEBUG flag, the driver
can provide more debug trace if needed. There is a bit mask variable,
IADebugFlag, which controls the output of the traces. You can find the bit
map of the IADebugFlag in iphase.h.
The debug trace can be turn on through the insmod command line option, for
example, "insmod iphase.o IADebugFlag=0xffffffff" can turn on all the debug
traces together with loading the driver.
6. Ia Driver Test Using ttcp_atm and PVC
For the PVC setup, the test machines can either be connected back-to-back or
through a switch. If connected through the switch, the switch must be
configured for the PVC(s).
a. For UBR test:
At the test machine intended to receive data, type:
ttcp_atm -r -a -s 0.100
At the other test machine, type:
ttcp_atm -t -a -s 0.100 -n 10000
Run "ttcp_atm -h" to display more options of the ttcp_atm tool.
b. For ABR test:
It is the same as the UBR testing, but with an extra command option:
-Pabr:max_pcr=<xxx>
where:
xxx = the maximum peak cell rate, from 170 - 353207.
This option must be set on both the machines.
c. For CBR test:
It is the same as the UBR testing, but with an extra command option:
-Pcbr:max_pcr=<xxx>
where:
xxx = the maximum peak cell rate, from 170 - 353207.
This option may only be set on the transmit machine.
OUTSTANDING ISSUES
------------------
Contact Information
-------------------
Customer Support:
United States: Telephone: (214) 654-5555
Fax: (214) 654-5500
E-Mail: intouch@iphase.com
Europe: Telephone: 33 (0)1 41 15 44 00
Fax: 33 (0)1 41 15 12 13
World Wide Web: http://www.iphase.com
Anonymous FTP: ftp.iphase.com

View file

@ -0,0 +1,38 @@
Here documents known IPsec corner cases which need to be keep in mind when
deploy various IPsec configuration in real world production environment.
1. IPcomp: Small IP packet won't get compressed at sender, and failed on
policy check on receiver.
Quote from RFC3173:
2.2. Non-Expansion Policy
If the total size of a compressed payload and the IPComp header, as
defined in section 3, is not smaller than the size of the original
payload, the IP datagram MUST be sent in the original non-compressed
form. To clarify: If an IP datagram is sent non-compressed, no
IPComp header is added to the datagram. This policy ensures saving
the decompression processing cycles and avoiding incurring IP
datagram fragmentation when the expanded datagram is larger than the
MTU.
Small IP datagrams are likely to expand as a result of compression.
Therefore, a numeric threshold should be applied before compression,
where IP datagrams of size smaller than the threshold are sent in the
original form without attempting compression. The numeric threshold
is implementation dependent.
Current IPComp implementation is indeed by the book, while as in practice
when sending non-compressed packet to the peer(whether or not packet len
is smaller than the threshold or the compressed len is large than original
packet len), the packet is dropped when checking the policy as this packet
matches the selector but not coming from any XFRM layer, i.e., with no
security path. Such naked packet will not eventually make it to upper layer.
The result is much more wired to the user when ping peer with different
payload length.
One workaround is try to set "level use" for each policy if user observed
above scenario. The consequence of doing so is small packet(uncompressed)
will skip policy checking on receiver side.

View file

@ -0,0 +1,72 @@
Options for the ipv6 module are supplied as parameters at load time.
Module options may be given as command line arguments to the insmod
or modprobe command, but are usually specified in either
/etc/modules.d/*.conf configuration files, or in a distro-specific
configuration file.
The available ipv6 module parameters are listed below. If a parameter
is not specified the default value is used.
The parameters are as follows:
disable
Specifies whether to load the IPv6 module, but disable all
its functionality. This might be used when another module
has a dependency on the IPv6 module being loaded, but no
IPv6 addresses or operations are desired.
The possible values and their effects are:
0
IPv6 is enabled.
This is the default value.
1
IPv6 is disabled.
No IPv6 addresses will be added to interfaces, and
it will not be possible to open an IPv6 socket.
A reboot is required to enable IPv6.
autoconf
Specifies whether to enable IPv6 address autoconfiguration
on all interfaces. This might be used when one does not wish
for addresses to be automatically generated from prefixes
received in Router Advertisements.
The possible values and their effects are:
0
IPv6 address autoconfiguration is disabled on all interfaces.
Only the IPv6 loopback address (::1) and link-local addresses
will be added to interfaces.
1
IPv6 address autoconfiguration is enabled on all interfaces.
This is the default value.
disable_ipv6
Specifies whether to disable IPv6 on all interfaces.
This might be used when no IPv6 addresses are desired.
The possible values and their effects are:
0
IPv6 is enabled on all interfaces.
This is the default value.
1
IPv6 is disabled on all interfaces.
No IPv6 addresses will be added to interfaces.

View file

@ -0,0 +1,211 @@
/proc/sys/net/ipv4/vs/* Variables:
am_droprate - INTEGER
default 10
It sets the always mode drop rate, which is used in the mode 3
of the drop_rate defense.
amemthresh - INTEGER
default 1024
It sets the available memory threshold (in pages), which is
used in the automatic modes of defense. When there is no
enough available memory, the respective strategy will be
enabled and the variable is automatically set to 2, otherwise
the strategy is disabled and the variable is set to 1.
backup_only - BOOLEAN
0 - disabled (default)
not 0 - enabled
If set, disable the director function while the server is
in backup mode to avoid packet loops for DR/TUN methods.
conntrack - BOOLEAN
0 - disabled (default)
not 0 - enabled
If set, maintain connection tracking entries for
connections handled by IPVS.
This should be enabled if connections handled by IPVS are to be
also handled by stateful firewall rules. That is, iptables rules
that make use of connection tracking. It is a performance
optimisation to disable this setting otherwise.
Connections handled by the IPVS FTP application module
will have connection tracking entries regardless of this setting.
Only available when IPVS is compiled with CONFIG_IP_VS_NFCT enabled.
cache_bypass - BOOLEAN
0 - disabled (default)
not 0 - enabled
If it is enabled, forward packets to the original destination
directly when no cache server is available and destination
address is not local (iph->daddr is RTN_UNICAST). It is mostly
used in transparent web cache cluster.
debug_level - INTEGER
0 - transmission error messages (default)
1 - non-fatal error messages
2 - configuration
3 - destination trash
4 - drop entry
5 - service lookup
6 - scheduling
7 - connection new/expire, lookup and synchronization
8 - state transition
9 - binding destination, template checks and applications
10 - IPVS packet transmission
11 - IPVS packet handling (ip_vs_in/ip_vs_out)
12 or more - packet traversal
Only available when IPVS is compiled with CONFIG_IP_VS_DEBUG enabled.
Higher debugging levels include the messages for lower debugging
levels, so setting debug level 2, includes level 0, 1 and 2
messages. Thus, logging becomes more and more verbose the higher
the level.
drop_entry - INTEGER
0 - disabled (default)
The drop_entry defense is to randomly drop entries in the
connection hash table, just in order to collect back some
memory for new connections. In the current code, the
drop_entry procedure can be activated every second, then it
randomly scans 1/32 of the whole and drops entries that are in
the SYN-RECV/SYNACK state, which should be effective against
syn-flooding attack.
The valid values of drop_entry are from 0 to 3, where 0 means
that this strategy is always disabled, 1 and 2 mean automatic
modes (when there is no enough available memory, the strategy
is enabled and the variable is automatically set to 2,
otherwise the strategy is disabled and the variable is set to
1), and 3 means that that the strategy is always enabled.
drop_packet - INTEGER
0 - disabled (default)
The drop_packet defense is designed to drop 1/rate packets
before forwarding them to real servers. If the rate is 1, then
drop all the incoming packets.
The value definition is the same as that of the drop_entry. In
the automatic mode, the rate is determined by the follow
formula: rate = amemthresh / (amemthresh - available_memory)
when available memory is less than the available memory
threshold. When the mode 3 is set, the always mode drop rate
is controlled by the /proc/sys/net/ipv4/vs/am_droprate.
expire_nodest_conn - BOOLEAN
0 - disabled (default)
not 0 - enabled
The default value is 0, the load balancer will silently drop
packets when its destination server is not available. It may
be useful, when user-space monitoring program deletes the
destination server (because of server overload or wrong
detection) and add back the server later, and the connections
to the server can continue.
If this feature is enabled, the load balancer will expire the
connection immediately when a packet arrives and its
destination server is not available, then the client program
will be notified that the connection is closed. This is
equivalent to the feature some people requires to flush
connections when its destination is not available.
expire_quiescent_template - BOOLEAN
0 - disabled (default)
not 0 - enabled
When set to a non-zero value, the load balancer will expire
persistent templates when the destination server is quiescent.
This may be useful, when a user makes a destination server
quiescent by setting its weight to 0 and it is desired that
subsequent otherwise persistent connections are sent to a
different destination server. By default new persistent
connections are allowed to quiescent destination servers.
If this feature is enabled, the load balancer will expire the
persistence template if it is to be used to schedule a new
connection and the destination server is quiescent.
nat_icmp_send - BOOLEAN
0 - disabled (default)
not 0 - enabled
It controls sending icmp error messages (ICMP_DEST_UNREACH)
for VS/NAT when the load balancer receives packets from real
servers but the connection entries don't exist.
secure_tcp - INTEGER
0 - disabled (default)
The secure_tcp defense is to use a more complicated TCP state
transition table. For VS/NAT, it also delays entering the
TCP ESTABLISHED state until the three way handshake is completed.
The value definition is the same as that of drop_entry and
drop_packet.
sync_threshold - INTEGER
default 3
It sets synchronization threshold, which is the minimum number
of incoming packets that a connection needs to receive before
the connection will be synchronized. A connection will be
synchronized, every time the number of its incoming packets
modulus 50 equals the threshold. The range of the threshold is
from 0 to 49.
snat_reroute - BOOLEAN
0 - disabled
not 0 - enabled (default)
If enabled, recalculate the route of SNATed packets from
realservers so that they are routed as if they originate from the
director. Otherwise they are routed as if they are forwarded by the
director.
If policy routing is in effect then it is possible that the route
of a packet originating from a director is routed differently to a
packet being forwarded by the director.
If policy routing is not in effect then the recalculated route will
always be the same as the original route so it is an optimisation
to disable snat_reroute and avoid the recalculation.
sync_persist_mode - INTEGER
default 0
Controls the synchronisation of connections when using persistence
0: All types of connections are synchronised
1: Attempt to reduce the synchronisation traffic depending on
the connection type. For persistent services avoid synchronisation
for normal connections, do it only for persistence templates.
In such case, for TCP and SCTP it may need enabling sloppy_tcp and
sloppy_sctp flags on backup servers. For non-persistent services
such optimization is not applied, mode 0 is assumed.
sync_version - INTEGER
default 1
The version of the synchronisation protocol used when sending
synchronisation messages.
0 selects the original synchronisation protocol (version 0). This
should be used when sending synchronisation messages to a legacy
system that only understands the original synchronisation protocol.
1 selects the current synchronisation protocol (version 1). This
should be used where possible.
Kernels with this sync_version entry are able to receive messages
of both version 1 and version 2 of the synchronisation protocol.

View file

@ -0,0 +1,10 @@
To use the IrDA protocols within Linux you will need to get a suitable copy
of the IrDA Utilities. More detailed information about these and associated
programs can be found on http://irda.sourceforge.net/
For more information about how to use the IrDA protocol stack, see the
Linux Infrared HOWTO by Werner Heuser <wehe@tuxmobil.org>:
<http://www.tuxmobil.org/Infrared-HOWTO/Infrared-HOWTO.html>
There is an active mailing list for discussing Linux-IrDA matters called
irda-users@lists.sourceforge.net

View file

@ -0,0 +1,433 @@
Linux Base Driver for 10 Gigabit Intel(R) Ethernet Network Connection
=====================================================================
March 14, 2011
Contents
========
- In This Release
- Identifying Your Adapter
- Building and Installation
- Command Line Parameters
- Improving Performance
- Additional Configurations
- Known Issues/Troubleshooting
- Support
In This Release
===============
This file describes the ixgb Linux Base Driver for the 10 Gigabit Intel(R)
Network Connection. This driver includes support for Itanium(R)2-based
systems.
For questions related to hardware requirements, refer to the documentation
supplied with your 10 Gigabit adapter. All hardware requirements listed apply
to use with Linux.
The following features are available in this kernel:
- Native VLANs
- Channel Bonding (teaming)
- SNMP
Channel Bonding documentation can be found in the Linux kernel source:
/Documentation/networking/bonding.txt
The driver information previously displayed in the /proc filesystem is not
supported in this release. Alternatively, you can use ethtool (version 1.6
or later), lspci, and ifconfig to obtain the same information.
Instructions on updating ethtool can be found in the section "Additional
Configurations" later in this document.
Identifying Your Adapter
========================
The following Intel network adapters are compatible with the drivers in this
release:
Controller Adapter Name Physical Layer
---------- ------------ --------------
82597EX Intel(R) PRO/10GbE LR/SR/CX4 10G Base-LR (1310 nm optical fiber)
Server Adapters 10G Base-SR (850 nm optical fiber)
10G Base-CX4(twin-axial copper cabling)
For more information on how to identify your adapter, go to the Adapter &
Driver ID Guide at:
http://support.intel.com/support/network/sb/CS-012904.htm
Building and Installation
=========================
select m for "Intel(R) PRO/10GbE support" located at:
Location:
-> Device Drivers
-> Network device support (NETDEVICES [=y])
-> Ethernet (10000 Mbit) (NETDEV_10000 [=y])
1. make modules && make modules_install
2. Load the module:
    modprobe ixgb <parameter>=<value>
The insmod command can be used if the full
path to the driver module is specified. For example:
insmod /lib/modules/<KERNEL VERSION>/kernel/drivers/net/ixgb/ixgb.ko
With 2.6 based kernels also make sure that older ixgb drivers are
removed from the kernel, before loading the new module:
rmmod ixgb; modprobe ixgb
3. Assign an IP address to the interface by entering the following, where
x is the interface number:
ifconfig ethx <IP_address>
4. Verify that the interface works. Enter the following, where <IP_address>
is the IP address for another machine on the same subnet as the interface
that is being tested:
ping <IP_address>
Command Line Parameters
=======================
If the driver is built as a module, the following optional parameters are
used by entering them on the command line with the modprobe command using
this syntax:
modprobe ixgb [<option>=<VAL1>,<VAL2>,...]
For example, with two 10GbE PCI adapters, entering:
modprobe ixgb TxDescriptors=80,128
loads the ixgb driver with 80 TX resources for the first adapter and 128 TX
resources for the second adapter.
The default value for each parameter is generally the recommended setting,
unless otherwise noted.
FlowControl
Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx)
Default: Read from the EEPROM
If EEPROM is not detected, default is 1
This parameter controls the automatic generation(Tx) and response(Rx) to
Ethernet PAUSE frames. There are hardware bugs associated with enabling
Tx flow control so beware.
RxDescriptors
Valid Range: 64-512
Default Value: 512
This value is the number of receive descriptors allocated by the driver.
Increasing this value allows the driver to buffer more incoming packets.
Each descriptor is 16 bytes. A receive buffer is also allocated for
each descriptor and can be either 2048, 4056, 8192, or 16384 bytes,
depending on the MTU setting. When the MTU size is 1500 or less, the
receive buffer size is 2048 bytes. When the MTU is greater than 1500 the
receive buffer size will be either 4056, 8192, or 16384 bytes. The
maximum MTU size is 16114.
RxIntDelay
Valid Range: 0-65535 (0=off)
Default Value: 72
This value delays the generation of receive interrupts in units of
0.8192 microseconds. Receive interrupt reduction can improve CPU
efficiency if properly tuned for specific network traffic. Increasing
this value adds extra latency to frame reception and can end up
decreasing the throughput of TCP traffic. If the system is reporting
dropped receives, this value may be set too high, causing the driver to
run out of available receive descriptors.
TxDescriptors
Valid Range: 64-4096
Default Value: 256
This value is the number of transmit descriptors allocated by the driver.
Increasing this value allows the driver to queue more transmits. Each
descriptor is 16 bytes.
XsumRX
Valid Range: 0-1
Default Value: 1
A value of '1' indicates that the driver should enable IP checksum
offload for received packets (both UDP and TCP) to the adapter hardware.
Improving Performance
=====================
With the 10 Gigabit server adapters, the default Linux configuration will
very likely limit the total available throughput artificially. There is a set
of configuration changes that, when applied together, will increase the ability
of Linux to transmit and receive data. The following enhancements were
originally acquired from settings published at http://www.spec.org/web99/ for
various submitted results using Linux.
NOTE: These changes are only suggestions, and serve as a starting point for
tuning your network performance.
The changes are made in three major ways, listed in order of greatest effect:
- Use ifconfig to modify the mtu (maximum transmission unit) and the txqueuelen
parameter.
- Use sysctl to modify /proc parameters (essentially kernel tuning)
- Use setpci to modify the MMRBC field in PCI-X configuration space to increase
transmit burst lengths on the bus.
NOTE: setpci modifies the adapter's configuration registers to allow it to read
up to 4k bytes at a time (for transmits). However, for some systems the
behavior after modifying this register may be undefined (possibly errors of
some kind). A power-cycle, hard reset or explicitly setting the e6 register
back to 22 (setpci -d 8086:1a48 e6.b=22) may be required to get back to a
stable configuration.
- COPY these lines and paste them into ixgb_perf.sh:
#!/bin/bash
echo "configuring network performance , edit this file to change the interface
or device ID of 10GbE card"
# set mmrbc to 4k reads, modify only Intel 10GbE device IDs
# replace 1a48 with appropriate 10GbE device's ID installed on the system,
# if needed.
setpci -d 8086:1a48 e6.b=2e
# set the MTU (max transmission unit) - it requires your switch and clients
# to change as well.
# set the txqueuelen
# your ixgb adapter should be loaded as eth1 for this to work, change if needed
ifconfig eth1 mtu 9000 txqueuelen 1000 up
# call the sysctl utility to modify /proc/sys entries
sysctl -p ./sysctl_ixgb.conf
- END ixgb_perf.sh
- COPY these lines and paste them into sysctl_ixgb.conf:
# some of the defaults may be different for your kernel
# call this file with sysctl -p <this file>
# these are just suggested values that worked well to increase throughput in
# several network benchmark tests, your mileage may vary
### IPV4 specific settings
# turn TCP timestamp support off, default 1, reduces CPU use
net.ipv4.tcp_timestamps = 0
# turn SACK support off, default on
# on systems with a VERY fast bus -> memory interface this is the big gainer
net.ipv4.tcp_sack = 0
# set min/default/max TCP read buffer, default 4096 87380 174760
net.ipv4.tcp_rmem = 10000000 10000000 10000000
# set min/pressure/max TCP write buffer, default 4096 16384 131072
net.ipv4.tcp_wmem = 10000000 10000000 10000000
# set min/pressure/max TCP buffer space, default 31744 32256 32768
net.ipv4.tcp_mem = 10000000 10000000 10000000
### CORE settings (mostly for socket and UDP effect)
# set maximum receive socket buffer size, default 131071
net.core.rmem_max = 524287
# set maximum send socket buffer size, default 131071
net.core.wmem_max = 524287
# set default receive socket buffer size, default 65535
net.core.rmem_default = 524287
# set default send socket buffer size, default 65535
net.core.wmem_default = 524287
# set maximum amount of option memory buffers, default 10240
net.core.optmem_max = 524287
# set number of unprocessed input packets before kernel starts dropping them; default 300
net.core.netdev_max_backlog = 300000
- END sysctl_ixgb.conf
Edit the ixgb_perf.sh script if necessary to change eth1 to whatever interface
your ixgb driver is using and/or replace '1a48' with appropriate 10GbE device's
ID installed on the system.
NOTE: Unless these scripts are added to the boot process, these changes will
only last only until the next system reboot.
Resolving Slow UDP Traffic
--------------------------
If your server does not seem to be able to receive UDP traffic as fast as it
can receive TCP traffic, it could be because Linux, by default, does not set
the network stack buffers as large as they need to be to support high UDP
transfer rates. One way to alleviate this problem is to allow more memory to
be used by the IP stack to store incoming data.
For instance, use the commands:
sysctl -w net.core.rmem_max=262143
and
sysctl -w net.core.rmem_default=262143
to increase the read buffer memory max and default to 262143 (256k - 1) from
defaults of max=131071 (128k - 1) and default=65535 (64k - 1). These variables
will increase the amount of memory used by the network stack for receives, and
can be increased significantly more if necessary for your application.
Additional Configurations
=========================
Configuring the Driver on Different Distributions
-------------------------------------------------
Configuring a network driver to load properly when the system is started is
distribution dependent. Typically, the configuration process involves adding
an alias line to /etc/modprobe.conf as well as editing other system startup
scripts and/or configuration files. Many popular Linux distributions ship
with tools to make these changes for you. To learn the proper way to
configure a network device for your system, refer to your distribution
documentation. If during this process you are asked for the driver or module
name, the name for the Linux Base Driver for the Intel 10GbE Family of
Adapters is ixgb.
Viewing Link Messages
---------------------
Link messages will not be displayed to the console if the distribution is
restricting system messages. In order to see network driver link messages on
your console, set dmesg to eight by entering the following:
dmesg -n 8
NOTE: This setting is not saved across reboots.
Jumbo Frames
------------
The driver supports Jumbo Frames for all adapters. Jumbo Frames support is
enabled by changing the MTU to a value larger than the default of 1500.
The maximum value for the MTU is 16114. Use the ifconfig command to
increase the MTU size. For example:
ifconfig ethx mtu 9000 up
The maximum MTU setting for Jumbo Frames is 16114. This value coincides
with the maximum Jumbo Frames size of 16128.
ethtool
-------
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. The ethtool
version 1.6 or later is required for this functionality.
The latest release of ethtool can be found from
http://ftp.kernel.org/pub/software/network/ethtool/
NOTE: The ethtool version 1.6 only supports a limited set of ethtool options.
Support for a more complete ethtool feature set can be enabled by
upgrading to the latest version.
NAPI
----
NAPI (Rx polling mode) is supported in the ixgb driver. NAPI is enabled
or disabled based on the configuration of the kernel. see CONFIG_IXGB_NAPI
See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI.
Known Issues/Troubleshooting
============================
NOTE: After installing the driver, if your Intel Network Connection is not
working, verify in the "In This Release" section of the readme that you have
installed the correct driver.
Intel(R) PRO/10GbE CX4 Server Adapter Cable Interoperability Issue with
Fujitsu XENPAK Module in SmartBits Chassis
---------------------------------------------------------------------
Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4
Server adapter is connected to a Fujitsu XENPAK CX4 module in a SmartBits
chassis using 15 m/24AWG cable assemblies manufactured by Fujitsu or Leoni.
The CRC errors may be received either by the Intel(R) PRO/10GbE CX4
Server adapter or the SmartBits. If this situation occurs using a different
cable assembly may resolve the issue.
CX4 Server Adapter Cable Interoperability Issues with HP Procurve 3400cl
Switch Port
------------------------------------------------------------------------
Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4 Server
adapter is connected to an HP Procurve 3400cl switch port using short cables
(1 m or shorter). If this situation occurs, using a longer cable may resolve
the issue.
Excessive CRC errors may be observed using Fujitsu 24AWG cable assemblies that
Are 10 m or longer or where using a Leoni 15 m/24AWG cable assembly. The CRC
errors may be received either by the CX4 Server adapter or at the switch. If
this situation occurs, using a different cable assembly may resolve the issue.
Jumbo Frames System Requirement
-------------------------------
Memory allocation failures have been observed on Linux systems with 64 MB
of RAM or less that are running Jumbo Frames. If you are using Jumbo
Frames, your system may require more than the advertised minimum
requirement of 64 MB of system memory.
Performance Degradation with Jumbo Frames
-----------------------------------------
Degradation in throughput performance may be observed in some Jumbo frames
environments. If this is observed, increasing the application's socket buffer
size and/or increasing the /proc/sys/net/ipv4/tcp_*mem entry values may help.
See the specific application manual and /usr/src/linux*/Documentation/
networking/ip-sysctl.txt for more details.
Allocating Rx Buffers when Using Jumbo Frames
---------------------------------------------
Allocating Rx buffers when using Jumbo Frames on 2.6.x kernels may fail if
the available memory is heavily fragmented. This issue may be seen with PCI-X
adapters or with packet split disabled. This can be reduced or eliminated
by changing the amount of available memory for receive buffer allocation, by
increasing /proc/sys/vm/min_free_kbytes.
Multiple Interfaces on Same Ethernet Broadcast Network
------------------------------------------------------
Due to the default ARP behavior on Linux, it is not possible to have
one system on two IP networks in the same Ethernet broadcast domain
(non-partitioned switch) behave as expected. All Ethernet interfaces
will respond to IP traffic for any IP address assigned to the system.
This results in unbalanced receive traffic.
If you have multiple interfaces in a server, do either of the following:
- Turn on ARP filtering by entering:
echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
- Install the interfaces in separate broadcast domains - either in
different switches or in a switch partitioned to VLANs.
UDP Stress Test Dropped Packet Issue
--------------------------------------
Under small packets UDP stress test with 10GbE driver, the Linux system
may drop UDP packets due to the fullness of socket buffers. You may want
to change the driver's Flow Control variables to the minimum value for
controlling packet reception.
Tx Hangs Possible Under Stress
------------------------------
Under stress conditions, if TX hangs occur, turning off TSO
"ethtool -K eth0 tso off" may resolve the problem.
Support
=======
For general information, go to the Intel support website at:
http://support.intel.com
or the Intel Wired Networking project hosted by Sourceforge at:
http://sourceforge.net/projects/e1000
If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related
to the issue to e1000-devel@lists.sf.net

View file

@ -0,0 +1,349 @@
Linux* Base Driver for the Intel(R) Ethernet 10 Gigabit PCI Express Family of
Adapters
=============================================================================
Intel 10 Gigabit Linux driver.
Copyright(c) 1999 - 2013 Intel Corporation.
Contents
========
- Identifying Your Adapter
- Additional Configurations
- Performance Tuning
- Known Issues
- Support
Identifying Your Adapter
========================
The driver in this release is compatible with 82598, 82599 and X540-based
Intel Network Connections.
For more information on how to identify your adapter, go to the Adapter &
Driver ID Guide at:
http://support.intel.com/support/network/sb/CS-012904.htm
SFP+ Devices with Pluggable Optics
----------------------------------
82599-BASED ADAPTERS
NOTES: If your 82599-based Intel(R) Network Adapter came with Intel optics, or
is an Intel(R) Ethernet Server Adapter X520-2, then it only supports Intel
optics and/or the direct attach cables listed below.
When 82599-based SFP+ devices are connected back to back, they should be set to
the same Speed setting via ethtool. Results may vary if you mix speed settings.
82598-based adapters support all passive direct attach cables that comply
with SFF-8431 v4.1 and SFF-8472 v10.4 specifications. Active direct attach
cables are not supported.
Supplier Type Part Numbers
SR Modules
Intel DUAL RATE 1G/10G SFP+ SR (bailed) FTLX8571D3BCV-IT
Intel DUAL RATE 1G/10G SFP+ SR (bailed) AFBR-703SDDZ-IN1
Intel DUAL RATE 1G/10G SFP+ SR (bailed) AFBR-703SDZ-IN2
LR Modules
Intel DUAL RATE 1G/10G SFP+ LR (bailed) FTLX1471D3BCV-IT
Intel DUAL RATE 1G/10G SFP+ LR (bailed) AFCT-701SDDZ-IN1
Intel DUAL RATE 1G/10G SFP+ LR (bailed) AFCT-701SDZ-IN2
The following is a list of 3rd party SFP+ modules and direct attach cables that
have received some testing. Not all modules are applicable to all devices.
Supplier Type Part Numbers
Finisar SFP+ SR bailed, 10g single rate FTLX8571D3BCL
Avago SFP+ SR bailed, 10g single rate AFBR-700SDZ
Finisar SFP+ LR bailed, 10g single rate FTLX1471D3BCL
Finisar DUAL RATE 1G/10G SFP+ SR (No Bail) FTLX8571D3QCV-IT
Avago DUAL RATE 1G/10G SFP+ SR (No Bail) AFBR-703SDZ-IN1
Finisar DUAL RATE 1G/10G SFP+ LR (No Bail) FTLX1471D3QCV-IT
Avago DUAL RATE 1G/10G SFP+ LR (No Bail) AFCT-701SDZ-IN1
Finistar 1000BASE-T SFP FCLF8522P2BTL
Avago 1000BASE-T SFP ABCU-5710RZ
82599-based adapters support all passive and active limiting direct attach
cables that comply with SFF-8431 v4.1 and SFF-8472 v10.4 specifications.
Laser turns off for SFP+ when ifconfig down
-------------------------------------------
"ifconfig down" turns off the laser for 82599-based SFP+ fiber adapters.
"ifconfig up" turns on the laser.
82598-BASED ADAPTERS
NOTES for 82598-Based Adapters:
- Intel(R) Network Adapters that support removable optical modules only support
their original module type (i.e., the Intel(R) 10 Gigabit SR Dual Port
Express Module only supports SR optical modules). If you plug in a different
type of module, the driver will not load.
- Hot Swapping/hot plugging optical modules is not supported.
- Only single speed, 10 gigabit modules are supported.
- LAN on Motherboard (LOMs) may support DA, SR, or LR modules. Other module
types are not supported. Please see your system documentation for details.
The following is a list of 3rd party SFP+ modules and direct attach cables that
have received some testing. Not all modules are applicable to all devices.
Supplier Type Part Numbers
Finisar SFP+ SR bailed, 10g single rate FTLX8571D3BCL
Avago SFP+ SR bailed, 10g single rate AFBR-700SDZ
Finisar SFP+ LR bailed, 10g single rate FTLX1471D3BCL
82598-based adapters support all passive direct attach cables that comply
with SFF-8431 v4.1 and SFF-8472 v10.4 specifications. Active direct attach
cables are not supported.
Flow Control
------------
Ethernet Flow Control (IEEE 802.3x) can be configured with ethtool to enable
receiving and transmitting pause frames for ixgbe. When TX is enabled, PAUSE
frames are generated when the receive packet buffer crosses a predefined
threshold. When rx is enabled, the transmit unit will halt for the time delay
specified when a PAUSE frame is received.
Flow Control is enabled by default. If you want to disable a flow control
capable link partner, use ethtool:
ethtool -A eth? autoneg off RX off TX off
NOTE: For 82598 backplane cards entering 1 gig mode, flow control default
behavior is changed to off. Flow control in 1 gig mode on these devices can
lead to Tx hangs.
Intel(R) Ethernet Flow Director
-------------------------------
Supports advanced filters that direct receive packets by their flows to
different queues. Enables tight control on routing a flow in the platform.
Matches flows and CPU cores for flow affinity. Supports multiple parameters
for flexible flow classification and load balancing.
Flow director is enabled only if the kernel is multiple TX queue capable.
An included script (set_irq_affinity.sh) automates setting the IRQ to CPU
affinity.
You can verify that the driver is using Flow Director by looking at the counter
in ethtool: fdir_miss and fdir_match.
Other ethtool Commands:
To enable Flow Director
ethtool -K ethX ntuple on
To add a filter
Use -U switch. e.g., ethtool -U ethX flow-type tcp4 src-ip 0x178000a
action 1
To see the list of filters currently present:
ethtool -u ethX
Perfect Filter: Perfect filter is an interface to load the filter table that
funnels all flow into queue_0 unless an alternative queue is specified using
"action". In that case, any flow that matches the filter criteria will be
directed to the appropriate queue.
If the queue is defined as -1, filter will drop matching packets.
To account for filter matches and misses, there are two stats in ethtool:
fdir_match and fdir_miss. In addition, rx_queue_N_packets shows the number of
packets processed by the Nth queue.
NOTE: Receive Packet Steering (RPS) and Receive Flow Steering (RFS) are not
compatible with Flow Director. IF Flow Director is enabled, these will be
disabled.
The following three parameters impact Flow Director.
FdirMode
--------
Valid Range: 0-2 (0=off, 1=ATR, 2=Perfect filter mode)
Default Value: 1
Flow Director filtering modes.
FdirPballoc
-----------
Valid Range: 0-2 (0=64k, 1=128k, 2=256k)
Default Value: 0
Flow Director allocated packet buffer size.
AtrSampleRate
--------------
Valid Range: 1-100
Default Value: 20
Software ATR Tx packet sample rate. For example, when set to 20, every 20th
packet, looks to see if the packet will create a new flow.
Node
----
Valid Range: 0-n
Default Value: 1 (off)
0 - n: where n is the number of NUMA nodes (i.e. 0 - 3) currently online in
your system
1: turns this option off
The Node parameter will allow you to pick which NUMA node you want to have
the adapter allocate memory on.
max_vfs
-------
Valid Range: 1-63
Default Value: 0
If the value is greater than 0 it will also force the VMDq parameter to be 1
or more.
This parameter adds support for SR-IOV. It causes the driver to spawn up to
max_vfs worth of virtual function.
Additional Configurations
=========================
Jumbo Frames
------------
The driver supports Jumbo Frames for all adapters. Jumbo Frames support is
enabled by changing the MTU to a value larger than the default of 1500.
The maximum value for the MTU is 16110. Use the ifconfig command to
increase the MTU size. For example:
ifconfig ethx mtu 9000 up
The maximum MTU setting for Jumbo Frames is 16110. This value coincides
with the maximum Jumbo Frames size of 16128.
Generic Receive Offload, aka GRO
--------------------------------
The driver supports the in-kernel software implementation of GRO. GRO has
shown that by coalescing Rx traffic into larger chunks of data, CPU
utilization can be significantly reduced when under large Rx load. GRO is an
evolution of the previously-used LRO interface. GRO is able to coalesce
other protocols besides TCP. It's also safe to use with configurations that
are problematic for LRO, namely bridging and iSCSI.
Data Center Bridging, aka DCB
-----------------------------
DCB is a configuration Quality of Service implementation in hardware.
It uses the VLAN priority tag (802.1p) to filter traffic. That means
that there are 8 different priorities that traffic can be filtered into.
It also enables priority flow control which can limit or eliminate the
number of dropped packets during network stress. Bandwidth can be
allocated to each of these priorities, which is enforced at the hardware
level.
To enable DCB support in ixgbe, you must enable the DCB netlink layer to
allow the userspace tools (see below) to communicate with the driver.
This can be found in the kernel configuration here:
-> Networking support
-> Networking options
-> Data Center Bridging support
Once this is selected, DCB support must be selected for ixgbe. This can
be found here:
-> Device Drivers
-> Network device support (NETDEVICES [=y])
-> Ethernet (10000 Mbit) (NETDEV_10000 [=y])
-> Intel(R) 10GbE PCI Express adapters support
-> Data Center Bridging (DCB) Support
After these options are selected, you must rebuild your kernel and your
modules.
In order to use DCB, userspace tools must be downloaded and installed.
The dcbd tools can be found at:
http://e1000.sf.net
Ethtool
-------
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. The latest
ethtool version is required for this functionality.
The latest release of ethtool can be found from
http://ftp.kernel.org/pub/software/network/ethtool/
FCoE
----
This release of the ixgbe driver contains new code to enable users to use
Fiber Channel over Ethernet (FCoE) and Data Center Bridging (DCB)
functionality that is supported by the 82598-based hardware. This code has
no default effect on the regular driver operation, and configuring DCB and
FCoE is outside the scope of this driver README. Refer to
http://www.open-fcoe.org/ for FCoE project information and contact
e1000-eedc@lists.sourceforge.net for DCB information.
MAC and VLAN anti-spoofing feature
----------------------------------
When a malicious driver attempts to send a spoofed packet, it is dropped by
the hardware and not transmitted. An interrupt is sent to the PF driver
notifying it of the spoof attempt.
When a spoofed packet is detected the PF driver will send the following
message to the system log (displayed by the "dmesg" command):
Spoof event(s) detected on VF (n)
Where n=the VF that attempted to do the spoofing.
Performance Tuning
==================
An excellent article on performance tuning can be found at:
http://www.redhat.com/promo/summit/2008/downloads/pdf/Thursday/Mark_Wagner.pdf
Known Issues
============
Enabling SR-IOV in a 32-bit or 64-bit Microsoft* Windows* Server 2008/R2
Guest OS using Intel (R) 82576-based GbE or Intel (R) 82599-based 10GbE
controller under KVM
------------------------------------------------------------------------
KVM Hypervisor/VMM supports direct assignment of a PCIe device to a VM. This
includes traditional PCIe devices, as well as SR-IOV-capable devices using
Intel 82576-based and 82599-based controllers.
While direct assignment of a PCIe device or an SR-IOV Virtual Function (VF)
to a Linux-based VM running 2.6.32 or later kernel works fine, there is a
known issue with Microsoft Windows Server 2008 VM that results in a "yellow
bang" error. This problem is within the KVM VMM itself, not the Intel driver,
or the SR-IOV logic of the VMM, but rather that KVM emulates an older CPU
model for the guests, and this older CPU model does not support MSI-X
interrupts, which is a requirement for Intel SR-IOV.
If you wish to use the Intel 82576 or 82599-based controllers in SR-IOV mode
with KVM and a Microsoft Windows Server 2008 guest try the following
workaround. The workaround is to tell KVM to emulate a different model of CPU
when using qemu to create the KVM guest:
"-cpu qemu64,model=13"
Support
=======
For general information, go to the Intel support website at:
http://support.intel.com
or the Intel Wired Networking project hosted by Sourceforge at:
http://e1000.sourceforge.net
If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related
to the issue to e1000-devel@lists.sf.net

View file

@ -0,0 +1,52 @@
Linux* Base Driver for Intel(R) Ethernet Network Connection
===========================================================
Intel Gigabit Linux driver.
Copyright(c) 1999 - 2013 Intel Corporation.
Contents
========
- Identifying Your Adapter
- Known Issues/Troubleshooting
- Support
This file describes the ixgbevf Linux* Base Driver for Intel Network
Connection.
The ixgbevf driver supports 82599-based virtual function devices that can only
be activated on kernels with CONFIG_PCI_IOV enabled.
The ixgbevf driver supports virtual functions generated by the ixgbe driver
with a max_vfs value of 1 or greater.
The guest OS loading the ixgbevf driver must support MSI-X interrupts.
VLANs: There is a limit of a total of 32 shared VLANs to 1 or more VFs.
Identifying Your Adapter
========================
For more information on how to identify your adapter, go to the Adapter &
Driver ID Guide at:
http://support.intel.com/support/go/network/adapter/idguide.htm
Known Issues/Troubleshooting
============================
Support
=======
For general information, go to the Intel support website at:
http://support.intel.com
or the Intel Wired Networking project hosted by Sourceforge at:
http://sourceforge.net/projects/e1000
If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related
to the issue to e1000-devel@lists.sf.net

View file

@ -0,0 +1,348 @@
This document describes how to use the kernel's L2TP drivers to
provide L2TP functionality. L2TP is a protocol that tunnels one or
more sessions over an IP tunnel. It is commonly used for VPNs
(L2TP/IPSec) and by ISPs to tunnel subscriber PPP sessions over an IP
network infrastructure. With L2TPv3, it is also useful as a Layer-2
tunneling infrastructure.
Features
========
L2TPv2 (PPP over L2TP (UDP tunnels)).
L2TPv3 ethernet pseudowires.
L2TPv3 PPP pseudowires.
L2TPv3 IP encapsulation.
Netlink sockets for L2TPv3 configuration management.
History
=======
The original pppol2tp driver was introduced in 2.6.23 and provided
L2TPv2 functionality (rfc2661). L2TPv2 is used to tunnel one or more PPP
sessions over a UDP tunnel.
L2TPv3 (rfc3931) changes the protocol to allow different frame types
to be passed over an L2TP tunnel by moving the PPP-specific parts of
the protocol out of the core L2TP packet headers. Each frame type is
known as a pseudowire type. Ethernet, PPP, HDLC, Frame Relay and ATM
pseudowires for L2TP are defined in separate RFC standards. Another
change for L2TPv3 is that it can be carried directly over IP with no
UDP header (UDP is optional). It is also possible to create static
unmanaged L2TPv3 tunnels manually without a control protocol
(userspace daemon) to manage them.
To support L2TPv3, the original pppol2tp driver was split up to
separate the L2TP and PPP functionality. Existing L2TPv2 userspace
apps should be unaffected as the original pppol2tp sockets API is
retained. L2TPv3, however, uses netlink to manage L2TPv3 tunnels and
sessions.
Design
======
The L2TP protocol separates control and data frames. The L2TP kernel
drivers handle only L2TP data frames; control frames are always
handled by userspace. L2TP control frames carry messages between L2TP
clients/servers and are used to setup / teardown tunnels and
sessions. An L2TP client or server is implemented in userspace.
Each L2TP tunnel is implemented using a UDP or L2TPIP socket; L2TPIP
provides L2TPv3 IP encapsulation (no UDP) and is implemented using a
new l2tpip socket family. The tunnel socket is typically created by
userspace, though for unmanaged L2TPv3 tunnels, the socket can also be
created by the kernel. Each L2TP session (pseudowire) gets a network
interface instance. In the case of PPP, these interfaces are created
indirectly by pppd using a pppol2tp socket. In the case of ethernet,
the netdevice is created upon a netlink request to create an L2TPv3
ethernet pseudowire.
For PPP, the PPPoL2TP driver, net/l2tp/l2tp_ppp.c, provides a
mechanism by which PPP frames carried through an L2TP session are
passed through the kernel's PPP subsystem. The standard PPP daemon,
pppd, handles all PPP interaction with the peer. PPP network
interfaces are created for each local PPP endpoint. The kernel's PPP
subsystem arranges for PPP control frames to be delivered to pppd,
while data frames are forwarded as usual.
For ethernet, the L2TPETH driver, net/l2tp/l2tp_eth.c, implements a
netdevice driver, managing virtual ethernet devices, one per
pseudowire. These interfaces can be managed using standard Linux tools
such as "ip" and "ifconfig". If only IP frames are passed over the
tunnel, the interface can be given an IP addresses of itself and its
peer. If non-IP frames are to be passed over the tunnel, the interface
can be added to a bridge using brctl. All L2TP datapath protocol
functions are handled by the L2TP core driver.
Each tunnel and session within a tunnel is assigned a unique tunnel_id
and session_id. These ids are carried in the L2TP header of every
control and data packet. (Actually, in L2TPv3, the tunnel_id isn't
present in data frames - it is inferred from the IP connection on
which the packet was received.) The L2TP driver uses the ids to lookup
internal tunnel and/or session contexts to determine how to handle the
packet. Zero tunnel / session ids are treated specially - zero ids are
never assigned to tunnels or sessions in the network. In the driver,
the tunnel context keeps a reference to the tunnel UDP or L2TPIP
socket. The session context holds data that lets the driver interface
to the kernel's network frame type subsystems, i.e. PPP, ethernet.
Userspace Programming
=====================
For L2TPv2, there are a number of requirements on the userspace L2TP
daemon in order to use the pppol2tp driver.
1. Use a UDP socket per tunnel.
2. Create a single PPPoL2TP socket per tunnel bound to a special null
session id. This is used only for communicating with the driver but
must remain open while the tunnel is active. Opening this tunnel
management socket causes the driver to mark the tunnel socket as an
L2TP UDP encapsulation socket and flags it for use by the
referenced tunnel id. This hooks up the UDP receive path via
udp_encap_rcv() in net/ipv4/udp.c. PPP data frames are never passed
in this special PPPoX socket.
3. Create a PPPoL2TP socket per L2TP session. This is typically done
by starting pppd with the pppol2tp plugin and appropriate
arguments. A PPPoL2TP tunnel management socket (Step 2) must be
created before the first PPPoL2TP session socket is created.
When creating PPPoL2TP sockets, the application provides information
to the driver about the socket in a socket connect() call. Source and
destination tunnel and session ids are provided, as well as the file
descriptor of a UDP socket. See struct pppol2tp_addr in
include/linux/if_pppol2tp.h. Note that zero tunnel / session ids are
treated specially. When creating the per-tunnel PPPoL2TP management
socket in Step 2 above, zero source and destination session ids are
specified, which tells the driver to prepare the supplied UDP file
descriptor for use as an L2TP tunnel socket.
Userspace may control behavior of the tunnel or session using
setsockopt and ioctl on the PPPoX socket. The following socket
options are supported:-
DEBUG - bitmask of debug message categories. See below.
SENDSEQ - 0 => don't send packets with sequence numbers
1 => send packets with sequence numbers
RECVSEQ - 0 => receive packet sequence numbers are optional
1 => drop receive packets without sequence numbers
LNSMODE - 0 => act as LAC.
1 => act as LNS.
REORDERTO - reorder timeout (in millisecs). If 0, don't try to reorder.
Only the DEBUG option is supported by the special tunnel management
PPPoX socket.
In addition to the standard PPP ioctls, a PPPIOCGL2TPSTATS is provided
to retrieve tunnel and session statistics from the kernel using the
PPPoX socket of the appropriate tunnel or session.
For L2TPv3, userspace must use the netlink API defined in
include/linux/l2tp.h to manage tunnel and session contexts. The
general procedure to create a new L2TP tunnel with one session is:-
1. Open a GENL socket using L2TP_GENL_NAME for configuring the kernel
using netlink.
2. Create a UDP or L2TPIP socket for the tunnel.
3. Create a new L2TP tunnel using a L2TP_CMD_TUNNEL_CREATE
request. Set attributes according to desired tunnel parameters,
referencing the UDP or L2TPIP socket created in the previous step.
4. Create a new L2TP session in the tunnel using a
L2TP_CMD_SESSION_CREATE request.
The tunnel and all of its sessions are closed when the tunnel socket
is closed. The netlink API may also be used to delete sessions and
tunnels. Configuration and status info may be set or read using netlink.
The L2TP driver also supports static (unmanaged) L2TPv3 tunnels. These
are where there is no L2TP control message exchange with the peer to
setup the tunnel; the tunnel is configured manually at each end of the
tunnel. There is no need for an L2TP userspace application in this
case -- the tunnel socket is created by the kernel and configured
using parameters sent in the L2TP_CMD_TUNNEL_CREATE netlink
request. The "ip" utility of iproute2 has commands for managing static
L2TPv3 tunnels; do "ip l2tp help" for more information.
Debugging
=========
The driver supports a flexible debug scheme where kernel trace
messages may be optionally enabled per tunnel and per session. Care is
needed when debugging a live system since the messages are not
rate-limited and a busy system could be swamped. Userspace uses
setsockopt on the PPPoX socket to set a debug mask.
The following debug mask bits are available:
PPPOL2TP_MSG_DEBUG verbose debug (if compiled in)
PPPOL2TP_MSG_CONTROL userspace - kernel interface
PPPOL2TP_MSG_SEQ sequence numbers handling
PPPOL2TP_MSG_DATA data packets
If enabled, files under a l2tp debugfs directory can be used to dump
kernel state about L2TP tunnels and sessions. To access it, the
debugfs filesystem must first be mounted.
# mount -t debugfs debugfs /debug
Files under the l2tp directory can then be accessed.
# cat /debug/l2tp/tunnels
The debugfs files should not be used by applications to obtain L2TP
state information because the file format is subject to change. It is
implemented to provide extra debug information to help diagnose
problems.) Users should use the netlink API.
/proc/net/pppol2tp is also provided for backwards compatibility with
the original pppol2tp driver. It lists information about L2TPv2
tunnels and sessions only. Its use is discouraged.
Unmanaged L2TPv3 Tunnels
========================
Some commercial L2TP products support unmanaged L2TPv3 ethernet
tunnels, where there is no L2TP control protocol; tunnels are
configured at each side manually. New commands are available in
iproute2's ip utility to support this.
To create an L2TPv3 ethernet pseudowire between local host 192.168.1.1
and peer 192.168.1.2, using IP addresses 10.5.1.1 and 10.5.1.2 for the
tunnel endpoints:-
# modprobe l2tp_eth
# modprobe l2tp_netlink
# ip l2tp add tunnel tunnel_id 1 peer_tunnel_id 1 udp_sport 5000 \
udp_dport 5000 encap udp local 192.168.1.1 remote 192.168.1.2
# ip l2tp add session tunnel_id 1 session_id 1 peer_session_id 1
# ifconfig -a
# ip addr add 10.5.1.2/32 peer 10.5.1.1/32 dev l2tpeth0
# ifconfig l2tpeth0 up
Choose IP addresses to be the address of a local IP interface and that
of the remote system. The IP addresses of the l2tpeth0 interface can be
anything suitable.
Repeat the above at the peer, with ports, tunnel/session ids and IP
addresses reversed. The tunnel and session IDs can be any non-zero
32-bit number, but the values must be reversed at the peer.
Host 1 Host2
udp_sport=5000 udp_sport=5001
udp_dport=5001 udp_dport=5000
tunnel_id=42 tunnel_id=45
peer_tunnel_id=45 peer_tunnel_id=42
session_id=128 session_id=5196755
peer_session_id=5196755 peer_session_id=128
When done at both ends of the tunnel, it should be possible to send
data over the network. e.g.
# ping 10.5.1.1
Sample Userspace Code
=====================
1. Create tunnel management PPPoX socket
kernel_fd = socket(AF_PPPOX, SOCK_DGRAM, PX_PROTO_OL2TP);
if (kernel_fd >= 0) {
struct sockaddr_pppol2tp sax;
struct sockaddr_in const *peer_addr;
peer_addr = l2tp_tunnel_get_peer_addr(tunnel);
memset(&sax, 0, sizeof(sax));
sax.sa_family = AF_PPPOX;
sax.sa_protocol = PX_PROTO_OL2TP;
sax.pppol2tp.fd = udp_fd; /* fd of tunnel UDP socket */
sax.pppol2tp.addr.sin_addr.s_addr = peer_addr->sin_addr.s_addr;
sax.pppol2tp.addr.sin_port = peer_addr->sin_port;
sax.pppol2tp.addr.sin_family = AF_INET;
sax.pppol2tp.s_tunnel = tunnel_id;
sax.pppol2tp.s_session = 0; /* special case: mgmt socket */
sax.pppol2tp.d_tunnel = 0;
sax.pppol2tp.d_session = 0; /* special case: mgmt socket */
if(connect(kernel_fd, (struct sockaddr *)&sax, sizeof(sax) ) < 0 ) {
perror("connect failed");
result = -errno;
goto err;
}
}
2. Create session PPPoX data socket
struct sockaddr_pppol2tp sax;
int fd;
/* Note, the target socket must be bound already, else it will not be ready */
sax.sa_family = AF_PPPOX;
sax.sa_protocol = PX_PROTO_OL2TP;
sax.pppol2tp.fd = tunnel_fd;
sax.pppol2tp.addr.sin_addr.s_addr = addr->sin_addr.s_addr;
sax.pppol2tp.addr.sin_port = addr->sin_port;
sax.pppol2tp.addr.sin_family = AF_INET;
sax.pppol2tp.s_tunnel = tunnel_id;
sax.pppol2tp.s_session = session_id;
sax.pppol2tp.d_tunnel = peer_tunnel_id;
sax.pppol2tp.d_session = peer_session_id;
/* session_fd is the fd of the session's PPPoL2TP socket.
* tunnel_fd is the fd of the tunnel UDP socket.
*/
fd = connect(session_fd, (struct sockaddr *)&sax, sizeof(sax));
if (fd < 0 ) {
return -errno;
}
return 0;
Internal Implementation
=======================
The driver keeps a struct l2tp_tunnel context per L2TP tunnel and a
struct l2tp_session context for each session. The l2tp_tunnel is
always associated with a UDP or L2TP/IP socket and keeps a list of
sessions in the tunnel. The l2tp_session context keeps kernel state
about the session. It has private data which is used for data specific
to the session type. With L2TPv2, the session always carried PPP
traffic. With L2TPv3, the session can also carry ethernet frames
(ethernet pseudowire) or other data types such as ATM, HDLC or Frame
Relay.
When a tunnel is first opened, the reference count on the socket is
increased using sock_hold(). This ensures that the kernel socket
cannot be removed while L2TP's data structures reference it.
Some L2TP sessions also have a socket (PPP pseudowires) while others
do not (ethernet pseudowires). We can't use the socket reference count
as the reference count for session contexts. The L2TP implementation
therefore has its own internal reference counts on the session
contexts.
To Do
=====
Add L2TP tunnel switching support. This would route tunneled traffic
from one L2TP tunnel into another. Specified in
http://tools.ietf.org/html/draft-ietf-l2tpext-tunnel-switching-08
Add L2TPv3 VLAN pseudowire support.
Add L2TPv3 IP pseudowire support.
Add L2TPv3 ATM pseudowire support.
Miscellaneous
=============
The L2TP drivers were developed as part of the OpenL2TP project by
Katalix Systems Ltd. OpenL2TP is a full-featured L2TP client / server,
designed from the ground up to have the L2TP datapath in the
kernel. The project also implemented the pppol2tp plugin for pppd
which allows pppd to use the kernel driver. Details can be found at
http://www.openl2tp.org.

View file

@ -0,0 +1,263 @@
The Linux LAPB Module Interface 1.3
Jonathan Naylor 29.12.96
Changed (Henner Eisen, 2000-10-29): int return value for data_indication()
The LAPB module will be a separately compiled module for use by any parts of
the Linux operating system that require a LAPB service. This document
defines the interfaces to, and the services provided by this module. The
term module in this context does not imply that the LAPB module is a
separately loadable module, although it may be. The term module is used in
its more standard meaning.
The interface to the LAPB module consists of functions to the module,
callbacks from the module to indicate important state changes, and
structures for getting and setting information about the module.
Structures
----------
Probably the most important structure is the skbuff structure for holding
received and transmitted data, however it is beyond the scope of this
document.
The two LAPB specific structures are the LAPB initialisation structure and
the LAPB parameter structure. These will be defined in a standard header
file, <linux/lapb.h>. The header file <net/lapb.h> is internal to the LAPB
module and is not for use.
LAPB Initialisation Structure
-----------------------------
This structure is used only once, in the call to lapb_register (see below).
It contains information about the device driver that requires the services
of the LAPB module.
struct lapb_register_struct {
void (*connect_confirmation)(int token, int reason);
void (*connect_indication)(int token, int reason);
void (*disconnect_confirmation)(int token, int reason);
void (*disconnect_indication)(int token, int reason);
int (*data_indication)(int token, struct sk_buff *skb);
void (*data_transmit)(int token, struct sk_buff *skb);
};
Each member of this structure corresponds to a function in the device driver
that is called when a particular event in the LAPB module occurs. These will
be described in detail below. If a callback is not required (!!) then a NULL
may be substituted.
LAPB Parameter Structure
------------------------
This structure is used with the lapb_getparms and lapb_setparms functions
(see below). They are used to allow the device driver to get and set the
operational parameters of the LAPB implementation for a given connection.
struct lapb_parms_struct {
unsigned int t1;
unsigned int t1timer;
unsigned int t2;
unsigned int t2timer;
unsigned int n2;
unsigned int n2count;
unsigned int window;
unsigned int state;
unsigned int mode;
};
T1 and T2 are protocol timing parameters and are given in units of 100ms. N2
is the maximum number of tries on the link before it is declared a failure.
The window size is the maximum number of outstanding data packets allowed to
be unacknowledged by the remote end, the value of the window is between 1
and 7 for a standard LAPB link, and between 1 and 127 for an extended LAPB
link.
The mode variable is a bit field used for setting (at present) three values.
The bit fields have the following meanings:
Bit Meaning
0 LAPB operation (0=LAPB_STANDARD 1=LAPB_EXTENDED).
1 [SM]LP operation (0=LAPB_SLP 1=LAPB=MLP).
2 DTE/DCE operation (0=LAPB_DTE 1=LAPB_DCE)
3-31 Reserved, must be 0.
Extended LAPB operation indicates the use of extended sequence numbers and
consequently larger window sizes, the default is standard LAPB operation.
MLP operation is the same as SLP operation except that the addresses used by
LAPB are different to indicate the mode of operation, the default is Single
Link Procedure. The difference between DCE and DTE operation is (i) the
addresses used for commands and responses, and (ii) when the DCE is not
connected, it sends DM without polls set, every T1. The upper case constant
names will be defined in the public LAPB header file.
Functions
---------
The LAPB module provides a number of function entry points.
int lapb_register(void *token, struct lapb_register_struct);
This must be called before the LAPB module may be used. If the call is
successful then LAPB_OK is returned. The token must be a unique identifier
generated by the device driver to allow for the unique identification of the
instance of the LAPB link. It is returned by the LAPB module in all of the
callbacks, and is used by the device driver in all calls to the LAPB module.
For multiple LAPB links in a single device driver, multiple calls to
lapb_register must be made. The format of the lapb_register_struct is given
above. The return values are:
LAPB_OK LAPB registered successfully.
LAPB_BADTOKEN Token is already registered.
LAPB_NOMEM Out of memory
int lapb_unregister(void *token);
This releases all the resources associated with a LAPB link. Any current
LAPB link will be abandoned without further messages being passed. After
this call, the value of token is no longer valid for any calls to the LAPB
function. The valid return values are:
LAPB_OK LAPB unregistered successfully.
LAPB_BADTOKEN Invalid/unknown LAPB token.
int lapb_getparms(void *token, struct lapb_parms_struct *parms);
This allows the device driver to get the values of the current LAPB
variables, the lapb_parms_struct is described above. The valid return values
are:
LAPB_OK LAPB getparms was successful.
LAPB_BADTOKEN Invalid/unknown LAPB token.
int lapb_setparms(void *token, struct lapb_parms_struct *parms);
This allows the device driver to set the values of the current LAPB
variables, the lapb_parms_struct is described above. The values of t1timer,
t2timer and n2count are ignored, likewise changing the mode bits when
connected will be ignored. An error implies that none of the values have
been changed. The valid return values are:
LAPB_OK LAPB getparms was successful.
LAPB_BADTOKEN Invalid/unknown LAPB token.
LAPB_INVALUE One of the values was out of its allowable range.
int lapb_connect_request(void *token);
Initiate a connect using the current parameter settings. The valid return
values are:
LAPB_OK LAPB is starting to connect.
LAPB_BADTOKEN Invalid/unknown LAPB token.
LAPB_CONNECTED LAPB module is already connected.
int lapb_disconnect_request(void *token);
Initiate a disconnect. The valid return values are:
LAPB_OK LAPB is starting to disconnect.
LAPB_BADTOKEN Invalid/unknown LAPB token.
LAPB_NOTCONNECTED LAPB module is not connected.
int lapb_data_request(void *token, struct sk_buff *skb);
Queue data with the LAPB module for transmitting over the link. If the call
is successful then the skbuff is owned by the LAPB module and may not be
used by the device driver again. The valid return values are:
LAPB_OK LAPB has accepted the data.
LAPB_BADTOKEN Invalid/unknown LAPB token.
LAPB_NOTCONNECTED LAPB module is not connected.
int lapb_data_received(void *token, struct sk_buff *skb);
Queue data with the LAPB module which has been received from the device. It
is expected that the data passed to the LAPB module has skb->data pointing
to the beginning of the LAPB data. If the call is successful then the skbuff
is owned by the LAPB module and may not be used by the device driver again.
The valid return values are:
LAPB_OK LAPB has accepted the data.
LAPB_BADTOKEN Invalid/unknown LAPB token.
Callbacks
---------
These callbacks are functions provided by the device driver for the LAPB
module to call when an event occurs. They are registered with the LAPB
module with lapb_register (see above) in the structure lapb_register_struct
(see above).
void (*connect_confirmation)(void *token, int reason);
This is called by the LAPB module when a connection is established after
being requested by a call to lapb_connect_request (see above). The reason is
always LAPB_OK.
void (*connect_indication)(void *token, int reason);
This is called by the LAPB module when the link is established by the remote
system. The value of reason is always LAPB_OK.
void (*disconnect_confirmation)(void *token, int reason);
This is called by the LAPB module when an event occurs after the device
driver has called lapb_disconnect_request (see above). The reason indicates
what has happened. In all cases the LAPB link can be regarded as being
terminated. The values for reason are:
LAPB_OK The LAPB link was terminated normally.
LAPB_NOTCONNECTED The remote system was not connected.
LAPB_TIMEDOUT No response was received in N2 tries from the remote
system.
void (*disconnect_indication)(void *token, int reason);
This is called by the LAPB module when the link is terminated by the remote
system or another event has occurred to terminate the link. This may be
returned in response to a lapb_connect_request (see above) if the remote
system refused the request. The values for reason are:
LAPB_OK The LAPB link was terminated normally by the remote
system.
LAPB_REFUSED The remote system refused the connect request.
LAPB_NOTCONNECTED The remote system was not connected.
LAPB_TIMEDOUT No response was received in N2 tries from the remote
system.
int (*data_indication)(void *token, struct sk_buff *skb);
This is called by the LAPB module when data has been received from the
remote system that should be passed onto the next layer in the protocol
stack. The skbuff becomes the property of the device driver and the LAPB
module will not perform any more actions on it. The skb->data pointer will
be pointing to the first byte of data after the LAPB header.
This method should return NET_RX_DROP (as defined in the header
file include/linux/netdevice.h) if and only if the frame was dropped
before it could be delivered to the upper layer.
void (*data_transmit)(void *token, struct sk_buff *skb);
This is called by the LAPB module when data is to be transmitted to the
remote system by the device driver. The skbuff becomes the property of the
device driver and the LAPB module will not perform any more actions on it.
The skb->data pointer will be pointing to the first byte of the LAPB header.

View file

@ -0,0 +1,131 @@
This is the ALPHA version of the ltpc driver.
In order to use it, you will need at least version 1.3.3 of the
netatalk package, and the Apple or Farallon LocalTalk PC card.
There are a number of different LocalTalk cards for the PC; this
driver applies only to the one with the 65c02 processor chip on it.
To include it in the kernel, select the CONFIG_LTPC switch in the
configuration dialog. You can also compile it as a module.
While the driver will attempt to autoprobe the I/O port address, IRQ
line, and DMA channel of the card, this does not always work. For
this reason, you should be prepared to supply these parameters
yourself. (see "Card Configuration" below for how to determine or
change the settings on your card)
When the driver is compiled into the kernel, you can add a line such
as the following to your /etc/lilo.conf:
append="ltpc=0x240,9,1"
where the parameters (in order) are the port address, IRQ, and DMA
channel. The second and third values can be omitted, in which case
the driver will try to determine them itself.
If you load the driver as a module, you can pass the parameters "io=",
"irq=", and "dma=" on the command line with insmod or modprobe, or add
them as options in a configuration file in /etc/modprobe.d/ directory:
alias lt0 ltpc # autoload the module when the interface is configured
options ltpc io=0x240 irq=9 dma=1
Before starting up the netatalk demons (perhaps in rc.local), you
need to add a line such as:
/sbin/ifconfig lt0 127.0.0.42
The address is unimportant - however, the card needs to be configured
with ifconfig so that Netatalk can find it.
The appropriate netatalk configuration depends on whether you are
attached to a network that includes AppleTalk routers or not. If,
like me, you are simply connecting to your home Macintoshes and
printers, you need to set up netatalk to "seed". The way I do this
is to have the lines
dummy -seed -phase 2 -net 2000 -addr 2000.26 -zone "1033"
lt0 -seed -phase 1 -net 1033 -addr 1033.27 -zone "1033"
in my atalkd.conf. What is going on here is that I need to fool
netatalk into thinking that there are two AppleTalk interfaces
present; otherwise, it refuses to seed. This is a hack, and a more
permanent solution would be to alter the netatalk code. Also, make
sure you have the correct name for the dummy interface - If it's
compiled as a module, you will need to refer to it as "dummy0" or some
such.
If you are attached to an extended AppleTalk network, with routers on
it, then you don't need to fool around with this -- the appropriate
line in atalkd.conf is
lt0 -phase 1
--------------------------------------
Card Configuration:
The interrupts and so forth are configured via the dipswitch on the
board. Set the switches so as not to conflict with other hardware.
Interrupts -- set at most one. If none are set, the driver uses
polled mode. Because the card was developed in the XT era, the
original documentation refers to IRQ2. Since you'll be running
this on an AT (or later) class machine, that really means IRQ9.
SW1 IRQ 4
SW2 IRQ 3
SW3 IRQ 9 (2 in original card documentation only applies to XT)
DMA -- choose DMA 1 or 3, and set both corresponding switches.
SW4 DMA 3
SW5 DMA 1
SW6 DMA 3
SW7 DMA 1
I/O address -- choose one.
SW8 220 / 240
--------------------------------------
IP:
Yes, it is possible to do IP over LocalTalk. However, you can't just
treat the LocalTalk device like an ordinary Ethernet device, even if
that's what it looks like to Netatalk.
Instead, you follow the same procedure as for doing IP in EtherTalk.
See Documentation/networking/ipddp.txt for more information about the
kernel driver and userspace tools needed.
--------------------------------------
BUGS:
IRQ autoprobing often doesn't work on a cold boot. To get around
this, either compile the driver as a module, or pass the parameters
for the card to the kernel as described above.
Also, as usual, autoprobing is not recommended when you use the driver
as a module. (though it usually works at boot time, at least)
Polled mode is *really* slow sometimes, but this seems to depend on
the configuration of the network.
It may theoretically be possible to use two LTPC cards in the same
machine, but this is unsupported, so if you really want to do this,
you'll probably have to hack the initialization code a bit.
______________________________________
THANKS:
Thanks to Alan Cox for helpful discussions early on in this
work, and to Denis Hainsworth for doing the bleeding-edge testing.
-- Bradford Johnson <bradford@math.umn.edu>
-- Updated 11/09/1998 by David Huggins-Daines <dhd@debian.org>

View file

@ -0,0 +1,95 @@
#
# This outlines the Linux authentication/association and
# deauthentication/disassociation flows.
#
# This can be converted into a diagram using the service
# at http://www.websequencediagrams.com/
#
participant userspace
participant mac80211
participant driver
alt authentication needed (not FT)
userspace->mac80211: authenticate
alt authenticated/authenticating already
mac80211->driver: sta_state(AP, not-exists)
mac80211->driver: bss_info_changed(clear BSSID)
else associated
note over mac80211,driver
like deauth/disassoc, without sending the
BA session stop & deauth/disassoc frames
end note
end
mac80211->driver: config(channel, channel type)
mac80211->driver: bss_info_changed(set BSSID, basic rate bitmap)
mac80211->driver: sta_state(AP, exists)
alt no probe request data known
mac80211->driver: TX directed probe request
driver->mac80211: RX probe response
end
mac80211->driver: TX auth frame
driver->mac80211: RX auth frame
alt WEP shared key auth
mac80211->driver: TX auth frame
driver->mac80211: RX auth frame
end
mac80211->driver: sta_state(AP, authenticated)
mac80211->userspace: RX auth frame
end
userspace->mac80211: associate
alt authenticated or associated
note over mac80211,driver: cleanup like for authenticate
end
alt not previously authenticated (FT)
mac80211->driver: config(channel, channel type)
mac80211->driver: bss_info_changed(set BSSID, basic rate bitmap)
mac80211->driver: sta_state(AP, exists)
mac80211->driver: sta_state(AP, authenticated)
end
mac80211->driver: TX assoc
driver->mac80211: RX assoc response
note over mac80211: init rate control
mac80211->driver: sta_state(AP, associated)
alt not using WPA
mac80211->driver: sta_state(AP, authorized)
end
mac80211->driver: set up QoS parameters
mac80211->driver: bss_info_changed(QoS, HT, associated with AID)
mac80211->userspace: associated
note left of userspace: associated now
alt using WPA
note over userspace
do 4-way-handshake
(data frames)
end note
userspace->mac80211: authorized
mac80211->driver: sta_state(AP, authorized)
end
userspace->mac80211: deauthenticate/disassociate
mac80211->driver: stop BA sessions
mac80211->driver: TX deauth/disassoc
mac80211->driver: flush frames
mac80211->driver: sta_state(AP,associated)
mac80211->driver: sta_state(AP,authenticated)
mac80211->driver: sta_state(AP,exists)
mac80211->driver: sta_state(AP,not-exists)
mac80211->driver: turn off powersave
mac80211->driver: bss_info_changed(clear BSSID, not associated, no QoS, ...)
mac80211->driver: config(channel type to non-HT)
mac80211->userspace: disconnected

View file

@ -0,0 +1,67 @@
How to use packet injection with mac80211
=========================================
mac80211 now allows arbitrary packets to be injected down any Monitor Mode
interface from userland. The packet you inject needs to be composed in the
following format:
[ radiotap header ]
[ ieee80211 header ]
[ payload ]
The radiotap format is discussed in
./Documentation/networking/radiotap-headers.txt.
Despite many radiotap parameters being currently defined, most only make sense
to appear on received packets. The following information is parsed from the
radiotap headers and used to control injection:
* IEEE80211_RADIOTAP_FLAGS
IEEE80211_RADIOTAP_F_FCS: FCS will be removed and recalculated
IEEE80211_RADIOTAP_F_WEP: frame will be encrypted if key available
IEEE80211_RADIOTAP_F_FRAG: frame will be fragmented if longer than the
current fragmentation threshold.
* IEEE80211_RADIOTAP_TX_FLAGS
IEEE80211_RADIOTAP_F_TX_NOACK: frame should be sent without waiting for
an ACK even if it is a unicast frame
The injection code can also skip all other currently defined radiotap fields
facilitating replay of captured radiotap headers directly.
Here is an example valid radiotap header defining some parameters
0x00, 0x00, // <-- radiotap version
0x0b, 0x00, // <- radiotap header length
0x04, 0x0c, 0x00, 0x00, // <-- bitmap
0x6c, // <-- rate
0x0c, //<-- tx power
0x01 //<-- antenna
The ieee80211 header follows immediately afterwards, looking for example like
this:
0x08, 0x01, 0x00, 0x00,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0x13, 0x22, 0x33, 0x44, 0x55, 0x66,
0x13, 0x22, 0x33, 0x44, 0x55, 0x66,
0x10, 0x86
Then lastly there is the payload.
After composing the packet contents, it is sent by send()-ing it to a logical
mac80211 interface that is in Monitor mode. Libpcap can also be used,
(which is easier than doing the work to bind the socket to the right
interface), along the following lines:
ppcap = pcap_open_live(szInterfaceName, 800, 1, 20, szErrbuf);
...
r = pcap_inject(ppcap, u8aSendBuffer, nLength);
You can also find a link to a complete inject application here:
http://wireless.kernel.org/en/users/Documentation/packetspammer
Andy Green <andy@warmcat.com>

View file

@ -0,0 +1,68 @@
mac80211_hwsim - software simulator of 802.11 radio(s) for mac80211
Copyright (c) 2008, Jouni Malinen <j@w1.fi>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License version 2 as
published by the Free Software Foundation.
Introduction
mac80211_hwsim is a Linux kernel module that can be used to simulate
arbitrary number of IEEE 802.11 radios for mac80211. It can be used to
test most of the mac80211 functionality and user space tools (e.g.,
hostapd and wpa_supplicant) in a way that matches very closely with
the normal case of using real WLAN hardware. From the mac80211 view
point, mac80211_hwsim is yet another hardware driver, i.e., no changes
to mac80211 are needed to use this testing tool.
The main goal for mac80211_hwsim is to make it easier for developers
to test their code and work with new features to mac80211, hostapd,
and wpa_supplicant. The simulated radios do not have the limitations
of real hardware, so it is easy to generate an arbitrary test setup
and always reproduce the same setup for future tests. In addition,
since all radio operation is simulated, any channel can be used in
tests regardless of regulatory rules.
mac80211_hwsim kernel module has a parameter 'radios' that can be used
to select how many radios are simulated (default 2). This allows
configuration of both very simply setups (e.g., just a single access
point and a station) or large scale tests (multiple access points with
hundreds of stations).
mac80211_hwsim works by tracking the current channel of each virtual
radio and copying all transmitted frames to all other radios that are
currently enabled and on the same channel as the transmitting
radio. Software encryption in mac80211 is used so that the frames are
actually encrypted over the virtual air interface to allow more
complete testing of encryption.
A global monitoring netdev, hwsim#, is created independent of
mac80211. This interface can be used to monitor all transmitted frames
regardless of channel.
Simple example
This example shows how to use mac80211_hwsim to simulate two radios:
one to act as an access point and the other as a station that
associates with the AP. hostapd and wpa_supplicant are used to take
care of WPA2-PSK authentication. In addition, hostapd is also
processing access point side of association.
# Build mac80211_hwsim as part of kernel configuration
# Load the module
modprobe mac80211_hwsim
# Run hostapd (AP) for wlan0
hostapd hostapd.conf
# Run wpa_supplicant (station) for wlan1
wpa_supplicant -Dwext -iwlan1 -c wpa_supplicant.conf
More test cases are available in hostap.git:
git://w1.fi/srv/git/hostap.git and mac80211_hwsim/tests subdirectory
(http://w1.fi/gitweb/gitweb.cgi?p=hostap.git;a=tree;f=mac80211_hwsim/tests)

View file

@ -0,0 +1,11 @@
interface=wlan0
driver=nl80211
hw_mode=g
channel=1
ssid=mac80211 test
wpa=2
wpa_key_mgmt=WPA-PSK
wpa_pairwise=CCMP
wpa_passphrase=12345678

View file

@ -0,0 +1,10 @@
ctrl_interface=/var/run/wpa_supplicant
network={
ssid="mac80211 test"
psk="12345678"
key_mgmt=WPA-PSK
proto=WPA2
pairwise=CCMP
group=CCMP
}

View file

@ -0,0 +1,79 @@
HOWTO for multiqueue network device support
===========================================
Section 1: Base driver requirements for implementing multiqueue support
Intro: Kernel support for multiqueue devices
---------------------------------------------------------
Kernel support for multiqueue devices is always present.
Section 1: Base driver requirements for implementing multiqueue support
-----------------------------------------------------------------------
Base drivers are required to use the new alloc_etherdev_mq() or
alloc_netdev_mq() functions to allocate the subqueues for the device. The
underlying kernel API will take care of the allocation and deallocation of
the subqueue memory, as well as netdev configuration of where the queues
exist in memory.
The base driver will also need to manage the queues as it does the global
netdev->queue_lock today. Therefore base drivers should use the
netif_{start|stop|wake}_subqueue() functions to manage each queue while the
device is still operational. netdev->queue_lock is still used when the device
comes online or when it's completely shut down (unregister_netdev(), etc.).
Section 2: Qdisc support for multiqueue devices
-----------------------------------------------
Currently two qdiscs are optimized for multiqueue devices. The first is the
default pfifo_fast qdisc. This qdisc supports one qdisc per hardware queue.
A new round-robin qdisc, sch_multiq also supports multiple hardware queues. The
qdisc is responsible for classifying the skb's and then directing the skb's to
bands and queues based on the value in skb->queue_mapping. Use this field in
the base driver to determine which queue to send the skb to.
sch_multiq has been added for hardware that wishes to avoid head-of-line
blocking. It will cycle though the bands and verify that the hardware queue
associated with the band is not stopped prior to dequeuing a packet.
On qdisc load, the number of bands is based on the number of queues on the
hardware. Once the association is made, any skb with skb->queue_mapping set,
will be queued to the band associated with the hardware queue.
Section 3: Brief howto using MULTIQ for multiqueue devices
---------------------------------------------------------------
The userspace command 'tc,' part of the iproute2 package, is used to configure
qdiscs. To add the MULTIQ qdisc to your network device, assuming the device
is called eth0, run the following command:
# tc qdisc add dev eth0 root handle 1: multiq
The qdisc will allocate the number of bands to equal the number of queues that
the device reports, and bring the qdisc online. Assuming eth0 has 4 Tx
queues, the band mapping would look like:
band 0 => queue 0
band 1 => queue 1
band 2 => queue 2
band 3 => queue 3
Traffic will begin flowing through each queue based on either the simple_tx_hash
function or based on netdev->select_queue() if you have it defined.
The behavior of tc filters remains the same. However a new tc action,
skbedit, has been added. Assuming you wanted to route all traffic to a
specific host, for example 192.168.0.3, through a specific queue you could use
this action and establish a filter such as:
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
match ip dst 192.168.0.3 \
action skbedit queue_mapping 3
Author: Alexander Duyck <alexander.h.duyck@intel.com>
Original Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>

View file

@ -0,0 +1,177 @@
started by Ingo Molnar <mingo@redhat.com>, 2001.09.17
2.6 port and netpoll api by Matt Mackall <mpm@selenic.com>, Sep 9 2003
IPv6 support by Cong Wang <xiyou.wangcong@gmail.com>, Jan 1 2013
Please send bug reports to Matt Mackall <mpm@selenic.com>
Satyam Sharma <satyam.sharma@gmail.com>, and Cong Wang <xiyou.wangcong@gmail.com>
Introduction:
=============
This module logs kernel printk messages over UDP allowing debugging of
problem where disk logging fails and serial consoles are impractical.
It can be used either built-in or as a module. As a built-in,
netconsole initializes immediately after NIC cards and will bring up
the specified interface as soon as possible. While this doesn't allow
capture of early kernel panics, it does capture most of the boot
process.
Sender and receiver configuration:
==================================
It takes a string configuration parameter "netconsole" in the
following format:
netconsole=[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]
where
src-port source for UDP packets (defaults to 6665)
src-ip source IP to use (interface address)
dev network interface (eth0)
tgt-port port for logging agent (6666)
tgt-ip IP address for logging agent
tgt-macaddr ethernet MAC address for logging agent (broadcast)
Examples:
linux netconsole=4444@10.0.0.1/eth1,9353@10.0.0.2/12:34:56:78:9a:bc
or
insmod netconsole netconsole=@/,@10.0.0.2/
or using IPv6
insmod netconsole netconsole=@/,@fd00:1:2:3::1/
It also supports logging to multiple remote agents by specifying
parameters for the multiple agents separated by semicolons and the
complete string enclosed in "quotes", thusly:
modprobe netconsole netconsole="@/,@10.0.0.2/;@/eth1,6892@10.0.0.3/"
Built-in netconsole starts immediately after the TCP stack is
initialized and attempts to bring up the supplied dev at the supplied
address.
The remote host has several options to receive the kernel messages,
for example:
1) syslogd
2) netcat
On distributions using a BSD-based netcat version (e.g. Fedora,
openSUSE and Ubuntu) the listening port must be specified without
the -p switch:
'nc -u -l -p <port>' / 'nc -u -l <port>' or
'netcat -u -l -p <port>' / 'netcat -u -l <port>'
3) socat
'socat udp-recv:<port> -'
Dynamic reconfiguration:
========================
Dynamic reconfigurability is a useful addition to netconsole that enables
remote logging targets to be dynamically added, removed, or have their
parameters reconfigured at runtime from a configfs-based userspace interface.
[ Note that the parameters of netconsole targets that were specified/created
from the boot/module option are not exposed via this interface, and hence
cannot be modified dynamically. ]
To include this feature, select CONFIG_NETCONSOLE_DYNAMIC when building the
netconsole module (or kernel, if netconsole is built-in).
Some examples follow (where configfs is mounted at the /sys/kernel/config
mountpoint).
To add a remote logging target (target names can be arbitrary):
cd /sys/kernel/config/netconsole/
mkdir target1
Note that newly created targets have default parameter values (as mentioned
above) and are disabled by default -- they must first be enabled by writing
"1" to the "enabled" attribute (usually after setting parameters accordingly)
as described below.
To remove a target:
rmdir /sys/kernel/config/netconsole/othertarget/
The interface exposes these parameters of a netconsole target to userspace:
enabled Is this target currently enabled? (read-write)
dev_name Local network interface name (read-write)
local_port Source UDP port to use (read-write)
remote_port Remote agent's UDP port (read-write)
local_ip Source IP address to use (read-write)
remote_ip Remote agent's IP address (read-write)
local_mac Local interface's MAC address (read-only)
remote_mac Remote agent's MAC address (read-write)
The "enabled" attribute is also used to control whether the parameters of
a target can be updated or not -- you can modify the parameters of only
disabled targets (i.e. if "enabled" is 0).
To update a target's parameters:
cat enabled # check if enabled is 1
echo 0 > enabled # disable the target (if required)
echo eth2 > dev_name # set local interface
echo 10.0.0.4 > remote_ip # update some parameter
echo cb:a9:87:65:43:21 > remote_mac # update more parameters
echo 1 > enabled # enable target again
You can also update the local interface dynamically. This is especially
useful if you want to use interfaces that have newly come up (and may not
have existed when netconsole was loaded / initialized).
Miscellaneous notes:
====================
WARNING: the default target ethernet setting uses the broadcast
ethernet address to send packets, which can cause increased load on
other systems on the same ethernet segment.
TIP: some LAN switches may be configured to suppress ethernet broadcasts
so it is advised to explicitly specify the remote agents' MAC addresses
from the config parameters passed to netconsole.
TIP: to find out the MAC address of, say, 10.0.0.2, you may try using:
ping -c 1 10.0.0.2 ; /sbin/arp -n | grep 10.0.0.2
TIP: in case the remote logging agent is on a separate LAN subnet than
the sender, it is suggested to try specifying the MAC address of the
default gateway (you may use /sbin/route -n to find it out) as the
remote MAC address instead.
NOTE: the network device (eth1 in the above case) can run any kind
of other network traffic, netconsole is not intrusive. Netconsole
might cause slight delays in other traffic if the volume of kernel
messages is high, but should have no other impact.
NOTE: if you find that the remote logging agent is not receiving or
printing all messages from the sender, it is likely that you have set
the "console_loglevel" parameter (on the sender) to only send high
priority messages to the console. You can change this at runtime using:
dmesg -n 8
or by specifying "debug" on the kernel command line at boot, to send
all kernel messages to the console. A specific value for this parameter
can also be set using the "loglevel" kernel boot option. See the
dmesg(8) man page and Documentation/kernel-parameters.txt for details.
Netconsole was designed to be as instantaneous as possible, to
enable the logging of even the most critical kernel bugs. It works
from IRQ contexts as well, and does not enable interrupts while
sending packets. Due to these unique needs, configuration cannot
be more automatic, and some fundamental limitations will remain:
only IP networks, UDP packets and ethernet devices are supported.

View file

@ -0,0 +1,224 @@
Information you need to know about netdev
-----------------------------------------
Q: What is netdev?
A: It is a mailing list for all network-related Linux stuff. This includes
anything found under net/ (i.e. core code like IPv6) and drivers/net
(i.e. hardware specific drivers) in the Linux source tree.
Note that some subsystems (e.g. wireless drivers) which have a high volume
of traffic have their own specific mailing lists.
The netdev list is managed (like many other Linux mailing lists) through
VGER ( http://vger.kernel.org/ ) and archives can be found below:
http://marc.info/?l=linux-netdev
http://www.spinics.net/lists/netdev/
Aside from subsystems like that mentioned above, all network-related Linux
development (i.e. RFC, review, comments, etc.) takes place on netdev.
Q: How do the changes posted to netdev make their way into Linux?
A: There are always two trees (git repositories) in play. Both are driven
by David Miller, the main network maintainer. There is the "net" tree,
and the "net-next" tree. As you can probably guess from the names, the
net tree is for fixes to existing code already in the mainline tree from
Linus, and net-next is where the new code goes for the future release.
You can find the trees here:
http://git.kernel.org/?p=linux/kernel/git/davem/net.git
http://git.kernel.org/?p=linux/kernel/git/davem/net-next.git
Q: How often do changes from these trees make it to the mainline Linus tree?
A: To understand this, you need to know a bit of background information
on the cadence of Linux development. Each new release starts off with
a two week "merge window" where the main maintainers feed their new
stuff to Linus for merging into the mainline tree. After the two weeks,
the merge window is closed, and it is called/tagged "-rc1". No new
features get mainlined after this -- only fixes to the rc1 content
are expected. After roughly a week of collecting fixes to the rc1
content, rc2 is released. This repeats on a roughly weekly basis
until rc7 (typically; sometimes rc6 if things are quiet, or rc8 if
things are in a state of churn), and a week after the last vX.Y-rcN
was done, the official "vX.Y" is released.
Relating that to netdev: At the beginning of the 2-week merge window,
the net-next tree will be closed - no new changes/features. The
accumulated new content of the past ~10 weeks will be passed onto
mainline/Linus via a pull request for vX.Y -- at the same time,
the "net" tree will start accumulating fixes for this pulled content
relating to vX.Y
An announcement indicating when net-next has been closed is usually
sent to netdev, but knowing the above, you can predict that in advance.
IMPORTANT: Do not send new net-next content to netdev during the
period during which net-next tree is closed.
Shortly after the two weeks have passed (and vX.Y-rc1 is released), the
tree for net-next reopens to collect content for the next (vX.Y+1) release.
If you aren't subscribed to netdev and/or are simply unsure if net-next
has re-opened yet, simply check the net-next git repository link above for
any new networking-related commits.
The "net" tree continues to collect fixes for the vX.Y content, and
is fed back to Linus at regular (~weekly) intervals. Meaning that the
focus for "net" is on stabilization and bugfixes.
Finally, the vX.Y gets released, and the whole cycle starts over.
Q: So where are we now in this cycle?
A: Load the mainline (Linus) page here:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git
and note the top of the "tags" section. If it is rc1, it is early
in the dev cycle. If it was tagged rc7 a week ago, then a release
is probably imminent.
Q: How do I indicate which tree (net vs. net-next) my patch should be in?
A: Firstly, think whether you have a bug fix or new "next-like" content.
Then once decided, assuming that you use git, use the prefix flag, i.e.
git format-patch --subject-prefix='PATCH net-next' start..finish
Use "net" instead of "net-next" (always lower case) in the above for
bug-fix net content. If you don't use git, then note the only magic in
the above is just the subject text of the outgoing e-mail, and you can
manually change it yourself with whatever MUA you are comfortable with.
Q: I sent a patch and I'm wondering what happened to it. How can I tell
whether it got merged?
A: Start by looking at the main patchworks queue for netdev:
http://patchwork.ozlabs.org/project/netdev/list/
The "State" field will tell you exactly where things are at with
your patch.
Q: The above only says "Under Review". How can I find out more?
A: Generally speaking, the patches get triaged quickly (in less than 48h).
So be patient. Asking the maintainer for status updates on your
patch is a good way to ensure your patch is ignored or pushed to
the bottom of the priority list.
Q: How can I tell what patches are queued up for backporting to the
various stable releases?
A: Normally Greg Kroah-Hartman collects stable commits himself, but
for networking, Dave collects up patches he deems critical for the
networking subsystem, and then hands them off to Greg.
There is a patchworks queue that you can see here:
http://patchwork.ozlabs.org/bundle/davem/stable/?state=*
It contains the patches which Dave has selected, but not yet handed
off to Greg. If Greg already has the patch, then it will be here:
http://git.kernel.org/cgit/linux/kernel/git/stable/stable-queue.git
A quick way to find whether the patch is in this stable-queue is
to simply clone the repo, and then git grep the mainline commit ID, e.g.
stable-queue$ git grep -l 284041ef21fdf2e
releases/3.0.84/ipv6-fix-possible-crashes-in-ip6_cork_release.patch
releases/3.4.51/ipv6-fix-possible-crashes-in-ip6_cork_release.patch
releases/3.9.8/ipv6-fix-possible-crashes-in-ip6_cork_release.patch
stable/stable-queue$
Q: I see a network patch and I think it should be backported to stable.
Should I request it via "stable@vger.kernel.org" like the references in
the kernel's Documentation/stable_kernel_rules.txt file say?
A: No, not for networking. Check the stable queues as per above 1st to see
if it is already queued. If not, then send a mail to netdev, listing
the upstream commit ID and why you think it should be a stable candidate.
Before you jump to go do the above, do note that the normal stable rules
in Documentation/stable_kernel_rules.txt still apply. So you need to
explicitly indicate why it is a critical fix and exactly what users are
impacted. In addition, you need to convince yourself that you _really_
think it has been overlooked, vs. having been considered and rejected.
Generally speaking, the longer it has had a chance to "soak" in mainline,
the better the odds that it is an OK candidate for stable. So scrambling
to request a commit be added the day after it appears should be avoided.
Q: I have created a network patch and I think it should be backported to
stable. Should I add a "Cc: stable@vger.kernel.org" like the references
in the kernel's Documentation/ directory say?
A: No. See above answer. In short, if you think it really belongs in
stable, then ensure you write a decent commit log that describes who
gets impacted by the bugfix and how it manifests itself, and when the
bug was introduced. If you do that properly, then the commit will
get handled appropriately and most likely get put in the patchworks
stable queue if it really warrants it.
If you think there is some valid information relating to it being in
stable that does _not_ belong in the commit log, then use the three
dash marker line as described in Documentation/SubmittingPatches to
temporarily embed that information into the patch that you send.
Q: Someone said that the comment style and coding convention is different
for the networking content. Is this true?
A: Yes, in a largely trivial way. Instead of this:
/*
* foobar blah blah blah
* another line of text
*/
it is requested that you make it look like this:
/* foobar blah blah blah
* another line of text
*/
Q: I am working in existing code that has the former comment style and not the
latter. Should I submit new code in the former style or the latter?
A: Make it the latter style, so that eventually all code in the domain of
netdev is of this format.
Q: I found a bug that might have possible security implications or similar.
Should I mail the main netdev maintainer off-list?
A: No. The current netdev maintainer has consistently requested that people
use the mailing lists and not reach out directly. If you aren't OK with
that, then perhaps consider mailing "security@kernel.org" or reading about
http://oss-security.openwall.org/wiki/mailing-lists/distros
as possible alternative mechanisms.
Q: What level of testing is expected before I submit my change?
A: If your changes are against net-next, the expectation is that you
have tested by layering your changes on top of net-next. Ideally you
will have done run-time testing specific to your change, but at a
minimum, your changes should survive an "allyesconfig" and an
"allmodconfig" build without new warnings or failures.
Q: Any other tips to help ensure my net/net-next patch gets OK'd?
A: Attention to detail. Re-read your own work as if you were the
reviewer. You can start with using checkpatch.pl, perhaps even
with the "--strict" flag. But do not be mindlessly robotic in
doing so. If your change is a bug-fix, make sure your commit log
indicates the end-user visible symptom, the underlying reason as
to why it happens, and then if necessary, explain why the fix proposed
is the best way to get things done. Don't mangle whitespace, and as
is common, don't mis-indent function arguments that span multiple lines.
If it is your first patch, mail it to yourself so you can test apply
it to an unpatched tree to confirm infrastructure didn't mangle it.
Finally, go back and read Documentation/SubmittingPatches to be
sure you are not repeating some common mistake documented there.

View file

@ -0,0 +1,167 @@
Netdev features mess and how to get out from it alive
=====================================================
Author:
Michał Mirosław <mirq-linux@rere.qmqm.pl>
Part I: Feature sets
======================
Long gone are the days when a network card would just take and give packets
verbatim. Today's devices add multiple features and bugs (read: offloads)
that relieve an OS of various tasks like generating and checking checksums,
splitting packets, classifying them. Those capabilities and their state
are commonly referred to as netdev features in Linux kernel world.
There are currently three sets of features relevant to the driver, and
one used internally by network core:
1. netdev->hw_features set contains features whose state may possibly
be changed (enabled or disabled) for a particular device by user's
request. This set should be initialized in ndo_init callback and not
changed later.
2. netdev->features set contains features which are currently enabled
for a device. This should be changed only by network core or in
error paths of ndo_set_features callback.
3. netdev->vlan_features set contains features whose state is inherited
by child VLAN devices (limits netdev->features set). This is currently
used for all VLAN devices whether tags are stripped or inserted in
hardware or software.
4. netdev->wanted_features set contains feature set requested by user.
This set is filtered by ndo_fix_features callback whenever it or
some device-specific conditions change. This set is internal to
networking core and should not be referenced in drivers.
Part II: Controlling enabled features
=======================================
When current feature set (netdev->features) is to be changed, new set
is calculated and filtered by calling ndo_fix_features callback
and netdev_fix_features(). If the resulting set differs from current
set, it is passed to ndo_set_features callback and (if the callback
returns success) replaces value stored in netdev->features.
NETDEV_FEAT_CHANGE notification is issued after that whenever current
set might have changed.
The following events trigger recalculation:
1. device's registration, after ndo_init returned success
2. user requested changes in features state
3. netdev_update_features() is called
ndo_*_features callbacks are called with rtnl_lock held. Missing callbacks
are treated as always returning success.
A driver that wants to trigger recalculation must do so by calling
netdev_update_features() while holding rtnl_lock. This should not be done
from ndo_*_features callbacks. netdev->features should not be modified by
driver except by means of ndo_fix_features callback.
Part III: Implementation hints
================================
* ndo_fix_features:
All dependencies between features should be resolved here. The resulting
set can be reduced further by networking core imposed limitations (as coded
in netdev_fix_features()). For this reason it is safer to disable a feature
when its dependencies are not met instead of forcing the dependency on.
This callback should not modify hardware nor driver state (should be
stateless). It can be called multiple times between successive
ndo_set_features calls.
Callback must not alter features contained in NETIF_F_SOFT_FEATURES or
NETIF_F_NEVER_CHANGE sets. The exception is NETIF_F_VLAN_CHALLENGED but
care must be taken as the change won't affect already configured VLANs.
* ndo_set_features:
Hardware should be reconfigured to match passed feature set. The set
should not be altered unless some error condition happens that can't
be reliably detected in ndo_fix_features. In this case, the callback
should update netdev->features to match resulting hardware state.
Errors returned are not (and cannot be) propagated anywhere except dmesg.
(Note: successful return is zero, >0 means silent error.)
Part IV: Features
===================
For current list of features, see include/linux/netdev_features.h.
This section describes semantics of some of them.
* Transmit checksumming
For complete description, see comments near the top of include/linux/skbuff.h.
Note: NETIF_F_HW_CSUM is a superset of NETIF_F_IP_CSUM + NETIF_F_IPV6_CSUM.
It means that device can fill TCP/UDP-like checksum anywhere in the packets
whatever headers there might be.
* Transmit TCP segmentation offload
NETIF_F_TSO_ECN means that hardware can properly split packets with CWR bit
set, be it TCPv4 (when NETIF_F_TSO is enabled) or TCPv6 (NETIF_F_TSO6).
* Transmit DMA from high memory
On platforms where this is relevant, NETIF_F_HIGHDMA signals that
ndo_start_xmit can handle skbs with frags in high memory.
* Transmit scatter-gather
Those features say that ndo_start_xmit can handle fragmented skbs:
NETIF_F_SG --- paged skbs (skb_shinfo()->frags), NETIF_F_FRAGLIST ---
chained skbs (skb->next/prev list).
* Software features
Features contained in NETIF_F_SOFT_FEATURES are features of networking
stack. Driver should not change behaviour based on them.
* LLTX driver (deprecated for hardware drivers)
NETIF_F_LLTX should be set in drivers that implement their own locking in
transmit path or don't need locking at all (e.g. software tunnels).
In ndo_start_xmit, it is recommended to use a try_lock and return
NETDEV_TX_LOCKED when the spin lock fails. The locking should also properly
protect against other callbacks (the rules you need to find out).
Don't use it for new drivers.
* netns-local device
NETIF_F_NETNS_LOCAL is set for devices that are not allowed to move between
network namespaces (e.g. loopback).
Don't use it in drivers.
* VLAN challenged
NETIF_F_VLAN_CHALLENGED should be set for devices which can't cope with VLAN
headers. Some drivers set this because the cards can't handle the bigger MTU.
[FIXME: Those cases could be fixed in VLAN code by allowing only reduced-MTU
VLANs. This may be not useful, though.]
* rx-fcs
This requests that the NIC append the Ethernet Frame Checksum (FCS)
to the end of the skb data. This allows sniffers and other tools to
read the CRC recorded by the NIC on receipt of the packet.
* rx-all
This requests that the NIC receive all possible frames, including errored
frames (such as bad FCS, etc). This can be helpful when sniffing a link with
bad packets on it. Some NICs may receive more packets if also put into normal
PROMISC mode.

View file

@ -0,0 +1,107 @@
Network Devices, the Kernel, and You!
Introduction
============
The following is a random collection of documentation regarding
network devices.
struct net_device allocation rules
==================================
Network device structures need to persist even after module is unloaded and
must be allocated with alloc_netdev_mqs() and friends.
If device has registered successfully, it will be freed on last use
by free_netdev(). This is required to handle the pathologic case cleanly
(example: rmmod mydriver </sys/class/net/myeth/mtu )
alloc_netdev_mqs()/alloc_netdev() reserve extra space for driver
private data which gets freed when the network device is freed. If
separately allocated data is attached to the network device
(netdev_priv(dev)) then it is up to the module exit handler to free that.
MTU
===
Each network device has a Maximum Transfer Unit. The MTU does not
include any link layer protocol overhead. Upper layer protocols must
not pass a socket buffer (skb) to a device to transmit with more data
than the mtu. The MTU does not include link layer header overhead, so
for example on Ethernet if the standard MTU is 1500 bytes used, the
actual skb will contain up to 1514 bytes because of the Ethernet
header. Devices should allow for the 4 byte VLAN header as well.
Segmentation Offload (GSO, TSO) is an exception to this rule. The
upper layer protocol may pass a large socket buffer to the device
transmit routine, and the device will break that up into separate
packets based on the current MTU.
MTU is symmetrical and applies both to receive and transmit. A device
must be able to receive at least the maximum size packet allowed by
the MTU. A network device may use the MTU as mechanism to size receive
buffers, but the device should allow packets with VLAN header. With
standard Ethernet mtu of 1500 bytes, the device should allow up to
1518 byte packets (1500 + 14 header + 4 tag). The device may either:
drop, truncate, or pass up oversize packets, but dropping oversize
packets is preferred.
struct net_device synchronization rules
=======================================
ndo_open:
Synchronization: rtnl_lock() semaphore.
Context: process
ndo_stop:
Synchronization: rtnl_lock() semaphore.
Context: process
Note: netif_running() is guaranteed false
ndo_do_ioctl:
Synchronization: rtnl_lock() semaphore.
Context: process
ndo_get_stats:
Synchronization: dev_base_lock rwlock.
Context: nominally process, but don't sleep inside an rwlock
ndo_start_xmit:
Synchronization: __netif_tx_lock spinlock.
When the driver sets NETIF_F_LLTX in dev->features this will be
called without holding netif_tx_lock. In this case the driver
has to lock by itself when needed. It is recommended to use a try lock
for this and return NETDEV_TX_LOCKED when the spin lock fails.
The locking there should also properly protect against
set_rx_mode. Note that the use of NETIF_F_LLTX is deprecated.
Don't use it for new drivers.
Context: Process with BHs disabled or BH (timer),
will be called with interrupts disabled by netconsole.
Return codes:
o NETDEV_TX_OK everything ok.
o NETDEV_TX_BUSY Cannot transmit packet, try later
Usually a bug, means queue start/stop flow control is broken in
the driver. Note: the driver must NOT put the skb in its DMA ring.
o NETDEV_TX_LOCKED Locking failed, please retry quickly.
Only valid when NETIF_F_LLTX is set.
ndo_tx_timeout:
Synchronization: netif_tx_lock spinlock; all TX queues frozen.
Context: BHs disabled
Notes: netif_queue_stopped() is guaranteed true
ndo_set_rx_mode:
Synchronization: netif_addr_lock spinlock.
Context: BHs disabled
struct napi_struct synchronization rules
========================================
napi->poll:
Synchronization: NAPI_STATE_SCHED bit in napi->state. Device
driver's ndo_stop method will invoke napi_disable() on
all NAPI instances which will do a sleeping poll on the
NAPI_STATE_SCHED napi->state bit, waiting for all pending
NAPI activity to cease.
Context: softirq
will be called with interrupts disabled by netconsole.

View file

@ -0,0 +1,79 @@
________________
NETIF Msg Level
The design of the network interface message level setting.
History
The design of the debugging message interface was guided and
constrained by backwards compatibility previous practice. It is useful
to understand the history and evolution in order to understand current
practice and relate it to older driver source code.
From the beginning of Linux, each network device driver has had a local
integer variable that controls the debug message level. The message
level ranged from 0 to 7, and monotonically increased in verbosity.
The message level was not precisely defined past level 3, but were
always implemented within +-1 of the specified level. Drivers tended
to shed the more verbose level messages as they matured.
0 Minimal messages, only essential information on fatal errors.
1 Standard messages, initialization status. No run-time messages
2 Special media selection messages, generally timer-driver.
3 Interface starts and stops, including normal status messages
4 Tx and Rx frame error messages, and abnormal driver operation
5 Tx packet queue information, interrupt events.
6 Status on each completed Tx packet and received Rx packets
7 Initial contents of Tx and Rx packets
Initially this message level variable was uniquely named in each driver
e.g. "lance_debug", so that a kernel symbolic debugger could locate and
modify the setting. When kernel modules became common, the variables
were consistently renamed to "debug" and allowed to be set as a module
parameter.
This approach worked well. However there is always a demand for
additional features. Over the years the following emerged as
reasonable and easily implemented enhancements
Using an ioctl() call to modify the level.
Per-interface rather than per-driver message level setting.
More selective control over the type of messages emitted.
The netif_msg recommendation adds these features with only a minor
complexity and code size increase.
The recommendation is the following points
Retaining the per-driver integer variable "debug" as a module
parameter with a default level of '1'.
Adding a per-interface private variable named "msg_enable". The
variable is a bit map rather than a level, and is initialized as
1 << debug
Or more precisely
debug < 0 ? 0 : 1 << min(sizeof(int)-1, debug)
Messages should changes from
if (debug > 1)
printk(MSG_DEBUG "%s: ...
to
if (np->msg_enable & NETIF_MSG_LINK)
printk(MSG_DEBUG "%s: ...
The set of message levels is named
Old level Name Bit position
0 NETIF_MSG_DRV 0x0001
1 NETIF_MSG_PROBE 0x0002
2 NETIF_MSG_LINK 0x0004
2 NETIF_MSG_TIMER 0x0004
3 NETIF_MSG_IFDOWN 0x0008
3 NETIF_MSG_IFUP 0x0008
4 NETIF_MSG_RX_ERR 0x0010
4 NETIF_MSG_TX_ERR 0x0010
5 NETIF_MSG_TX_QUEUED 0x0020
5 NETIF_MSG_INTR 0x0020
6 NETIF_MSG_TX_DONE 0x0040
6 NETIF_MSG_RX_STATUS 0x0040
7 NETIF_MSG_PKTDATA 0x0080

View file

@ -0,0 +1,339 @@
This file documents how to use memory mapped I/O with netlink.
Author: Patrick McHardy <kaber@trash.net>
Overview
--------
Memory mapped netlink I/O can be used to increase throughput and decrease
overhead of unicast receive and transmit operations. Some netlink subsystems
require high throughput, these are mainly the netfilter subsystems
nfnetlink_queue and nfnetlink_log, but it can also help speed up large
dump operations of f.i. the routing database.
Memory mapped netlink I/O used two circular ring buffers for RX and TX which
are mapped into the processes address space.
The RX ring is used by the kernel to directly construct netlink messages into
user-space memory without copying them as done with regular socket I/O,
additionally as long as the ring contains messages no recvmsg() or poll()
syscalls have to be issued by user-space to get more message.
The TX ring is used to process messages directly from user-space memory, the
kernel processes all messages contained in the ring using a single sendmsg()
call.
Usage overview
--------------
In order to use memory mapped netlink I/O, user-space needs three main changes:
- ring setup
- conversion of the RX path to get messages from the ring instead of recvmsg()
- conversion of the TX path to construct messages into the ring
Ring setup is done using setsockopt() to provide the ring parameters to the
kernel, then a call to mmap() to map the ring into the processes address space:
- setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &params, sizeof(params));
- setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &params, sizeof(params));
- ring = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0)
Usage of either ring is optional, but even if only the RX ring is used the
mapping still needs to be writable in order to update the frame status after
processing.
Conversion of the reception path involves calling poll() on the file
descriptor, once the socket is readable the frames from the ring are
processed in order until no more messages are available, as indicated by
a status word in the frame header.
On kernel side, in order to make use of memory mapped I/O on receive, the
originating netlink subsystem needs to support memory mapped I/O, otherwise
it will use an allocated socket buffer as usual and the contents will be
copied to the ring on transmission, nullifying most of the performance gains.
Dumps of kernel databases automatically support memory mapped I/O.
Conversion of the transmit path involves changing message construction to
use memory from the TX ring instead of (usually) a buffer declared on the
stack and setting up the frame header appropriately. Optionally poll() can
be used to wait for free frames in the TX ring.
Structured and definitions for using memory mapped I/O are contained in
<linux/netlink.h>.
RX and TX rings
----------------
Each ring contains a number of continuous memory blocks, containing frames of
fixed size dependent on the parameters used for ring setup.
Ring: [ block 0 ]
[ frame 0 ]
[ frame 1 ]
[ block 1 ]
[ frame 2 ]
[ frame 3 ]
...
[ block n ]
[ frame 2 * n ]
[ frame 2 * n + 1 ]
The blocks are only visible to the kernel, from the point of view of user-space
the ring just contains the frames in a continuous memory zone.
The ring parameters used for setting up the ring are defined as follows:
struct nl_mmap_req {
unsigned int nm_block_size;
unsigned int nm_block_nr;
unsigned int nm_frame_size;
unsigned int nm_frame_nr;
};
Frames are grouped into blocks, where each block is a continuous region of memory
and holds nm_block_size / nm_frame_size frames. The total number of frames in
the ring is nm_frame_nr. The following invariants hold:
- frames_per_block = nm_block_size / nm_frame_size
- nm_frame_nr = frames_per_block * nm_block_nr
Some parameters are constrained, specifically:
- nm_block_size must be a multiple of the architectures memory page size.
The getpagesize() function can be used to get the page size.
- nm_frame_size must be equal or larger to NL_MMAP_HDRLEN, IOW a frame must be
able to hold at least the frame header
- nm_frame_size must be smaller or equal to nm_block_size
- nm_frame_size must be a multiple of NL_MMAP_MSG_ALIGNMENT
- nm_frame_nr must equal the actual number of frames as specified above.
When the kernel can't allocate physically continuous memory for a ring block,
it will fall back to use physically discontinuous memory. This might affect
performance negatively, in order to avoid this the nm_frame_size parameter
should be chosen to be as small as possible for the required frame size and
the number of blocks should be increased instead.
Ring frames
------------
Each frames contain a frame header, consisting of a synchronization word and some
meta-data, and the message itself.
Frame: [ header message ]
The frame header is defined as follows:
struct nl_mmap_hdr {
unsigned int nm_status;
unsigned int nm_len;
__u32 nm_group;
/* credentials */
__u32 nm_pid;
__u32 nm_uid;
__u32 nm_gid;
};
- nm_status is used for synchronizing processing between the kernel and user-
space and specifies ownership of the frame as well as the operation to perform
- nm_len contains the length of the message contained in the data area
- nm_group specified the destination multicast group of message
- nm_pid, nm_uid and nm_gid contain the netlink pid, UID and GID of the sending
process. These values correspond to the data available using SOCK_PASSCRED in
the SCM_CREDENTIALS cmsg.
The possible values in the status word are:
- NL_MMAP_STATUS_UNUSED:
RX ring: frame belongs to the kernel and contains no message
for user-space. Approriate action is to invoke poll()
to wait for new messages.
TX ring: frame belongs to user-space and can be used for
message construction.
- NL_MMAP_STATUS_RESERVED:
RX ring only: frame is currently used by the kernel for message
construction and contains no valid message yet.
Appropriate action is to invoke poll() to wait for
new messages.
- NL_MMAP_STATUS_VALID:
RX ring: frame contains a valid message. Approriate action is
to process the message and release the frame back to
the kernel by setting the status to
NL_MMAP_STATUS_UNUSED or queue the frame by setting the
status to NL_MMAP_STATUS_SKIP.
TX ring: the frame contains a valid message from user-space to
be processed by the kernel. After completing processing
the kernel will release the frame back to user-space by
setting the status to NL_MMAP_STATUS_UNUSED.
- NL_MMAP_STATUS_COPY:
RX ring only: a message is ready to be processed but could not be
stored in the ring, either because it exceeded the
frame size or because the originating subsystem does
not support memory mapped I/O. Appropriate action is
to invoke recvmsg() to receive the message and release
the frame back to the kernel by setting the status to
NL_MMAP_STATUS_UNUSED.
- NL_MMAP_STATUS_SKIP:
RX ring only: user-space queued the message for later processing, but
processed some messages following it in the ring. The
kernel should skip this frame when looking for unused
frames.
The data area of a frame begins at a offset of NL_MMAP_HDRLEN relative to the
frame header.
TX limitations
--------------
Kernel processing usually involves validation of the message received by
user-space, then processing its contents. The kernel must assure that
userspace is not able to modify the message contents after they have been
validated. In order to do so, the message is copied from the ring frame
to an allocated buffer if either of these conditions is false:
- only a single mapping of the ring exists
- the file descriptor is not shared between processes
This means that for threaded programs, the kernel will fall back to copying.
Example
-------
Ring setup:
unsigned int block_size = 16 * getpagesize();
struct nl_mmap_req req = {
.nm_block_size = block_size,
.nm_block_nr = 64,
.nm_frame_size = 16384,
.nm_frame_nr = 64 * block_size / 16384,
};
unsigned int ring_size;
void *rx_ring, *tx_ring;
/* Configure ring parameters */
if (setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &req, sizeof(req)) < 0)
exit(1);
if (setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &req, sizeof(req)) < 0)
exit(1)
/* Calculate size of each individual ring */
ring_size = req.nm_block_nr * req.nm_block_size;
/* Map RX/TX rings. The TX ring is located after the RX ring */
rx_ring = mmap(NULL, 2 * ring_size, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, 0);
if ((long)rx_ring == -1L)
exit(1);
tx_ring = rx_ring + ring_size:
Message reception:
This example assumes some ring parameters of the ring setup are available.
unsigned int frame_offset = 0;
struct nl_mmap_hdr *hdr;
struct nlmsghdr *nlh;
unsigned char buf[16384];
ssize_t len;
while (1) {
struct pollfd pfds[1];
pfds[0].fd = fd;
pfds[0].events = POLLIN | POLLERR;
pfds[0].revents = 0;
if (poll(pfds, 1, -1) < 0 && errno != -EINTR)
exit(1);
/* Check for errors. Error handling omitted */
if (pfds[0].revents & POLLERR)
<handle error>
/* If no new messages, poll again */
if (!(pfds[0].revents & POLLIN))
continue;
/* Process all frames */
while (1) {
/* Get next frame header */
hdr = rx_ring + frame_offset;
if (hdr->nm_status == NL_MMAP_STATUS_VALID) {
/* Regular memory mapped frame */
nlh = (void *)hdr + NL_MMAP_HDRLEN;
len = hdr->nm_len;
/* Release empty message immediately. May happen
* on error during message construction.
*/
if (len == 0)
goto release;
} else if (hdr->nm_status == NL_MMAP_STATUS_COPY) {
/* Frame queued to socket receive queue */
len = recv(fd, buf, sizeof(buf), MSG_DONTWAIT);
if (len <= 0)
break;
nlh = buf;
} else
/* No more messages to process, continue polling */
break;
process_msg(nlh);
release:
/* Release frame back to the kernel */
hdr->nm_status = NL_MMAP_STATUS_UNUSED;
/* Advance frame offset to next frame */
frame_offset = (frame_offset + frame_size) % ring_size;
}
}
Message transmission:
This example assumes some ring parameters of the ring setup are available.
A single message is constructed and transmitted, to send multiple messages
at once they would be constructed in consecutive frames before a final call
to sendto().
unsigned int frame_offset = 0;
struct nl_mmap_hdr *hdr;
struct nlmsghdr *nlh;
struct sockaddr_nl addr = {
.nl_family = AF_NETLINK,
};
hdr = tx_ring + frame_offset;
if (hdr->nm_status != NL_MMAP_STATUS_UNUSED)
/* No frame available. Use poll() to avoid. */
exit(1);
nlh = (void *)hdr + NL_MMAP_HDRLEN;
/* Build message */
build_message(nlh);
/* Fill frame header: length and status need to be set */
hdr->nm_len = nlh->nlmsg_len;
hdr->nm_status = NL_MMAP_STATUS_VALID;
if (sendto(fd, NULL, 0, 0, &addr, sizeof(addr)) < 0)
exit(1);
/* Advance frame offset to next frame */
frame_offset = (frame_offset + frame_size) % ring_size;

View file

@ -0,0 +1,176 @@
/proc/sys/net/netfilter/nf_conntrack_* Variables:
nf_conntrack_acct - BOOLEAN
0 - disabled (default)
not 0 - enabled
Enable connection tracking flow accounting. 64-bit byte and packet
counters per flow are added.
nf_conntrack_buckets - INTEGER (read-only)
Size of hash table. If not specified as parameter during module
loading, the default size is calculated by dividing total memory
by 16384 to determine the number of buckets but the hash table will
never have fewer than 32 or more than 16384 buckets.
nf_conntrack_checksum - BOOLEAN
0 - disabled
not 0 - enabled (default)
Verify checksum of incoming packets. Packets with bad checksums are
in INVALID state. If this is enabled, such packets will not be
considered for connection tracking.
nf_conntrack_count - INTEGER (read-only)
Number of currently allocated flow entries.
nf_conntrack_events - BOOLEAN
0 - disabled
not 0 - enabled (default)
If this option is enabled, the connection tracking code will
provide userspace with connection tracking events via ctnetlink.
nf_conntrack_events_retry_timeout - INTEGER (seconds)
default 15
This option is only relevant when "reliable connection tracking
events" are used. Normally, ctnetlink is "lossy", that is,
events are normally dropped when userspace listeners can't keep up.
Userspace can request "reliable event mode". When this mode is
active, the conntrack will only be destroyed after the event was
delivered. If event delivery fails, the kernel periodically
re-tries to send the event to userspace.
This is the maximum interval the kernel should use when re-trying
to deliver the destroy event.
A higher number means there will be fewer delivery retries and it
will take longer for a backlog to be processed.
nf_conntrack_expect_max - INTEGER
Maximum size of expectation table. Default value is
nf_conntrack_buckets / 256. Minimum is 1.
nf_conntrack_frag6_high_thresh - INTEGER
default 262144
Maximum memory used to reassemble IPv6 fragments. When
nf_conntrack_frag6_high_thresh bytes of memory is allocated for this
purpose, the fragment handler will toss packets until
nf_conntrack_frag6_low_thresh is reached.
nf_conntrack_frag6_low_thresh - INTEGER
default 196608
See nf_conntrack_frag6_low_thresh
nf_conntrack_frag6_timeout - INTEGER (seconds)
default 60
Time to keep an IPv6 fragment in memory.
nf_conntrack_generic_timeout - INTEGER (seconds)
default 600
Default for generic timeout. This refers to layer 4 unknown/unsupported
protocols.
nf_conntrack_helper - BOOLEAN
0 - disabled
not 0 - enabled (default)
Enable automatic conntrack helper assignment.
nf_conntrack_icmp_timeout - INTEGER (seconds)
default 30
Default for ICMP timeout.
nf_conntrack_icmpv6_timeout - INTEGER (seconds)
default 30
Default for ICMP6 timeout.
nf_conntrack_log_invalid - INTEGER
0 - disable (default)
1 - log ICMP packets
6 - log TCP packets
17 - log UDP packets
33 - log DCCP packets
41 - log ICMPv6 packets
136 - log UDPLITE packets
255 - log packets of any protocol
Log invalid packets of a type specified by value.
nf_conntrack_max - INTEGER
Size of connection tracking table. Default value is
nf_conntrack_buckets value * 4.
nf_conntrack_tcp_be_liberal - BOOLEAN
0 - disabled (default)
not 0 - enabled
Be conservative in what you do, be liberal in what you accept from others.
If it's non-zero, we mark only out of window RST segments as INVALID.
nf_conntrack_tcp_loose - BOOLEAN
0 - disabled
not 0 - enabled (default)
If it is set to zero, we disable picking up already established
connections.
nf_conntrack_tcp_max_retrans - INTEGER
default 3
Maximum number of packets that can be retransmitted without
received an (acceptable) ACK from the destination. If this number
is reached, a shorter timer will be started.
nf_conntrack_tcp_timeout_close - INTEGER (seconds)
default 10
nf_conntrack_tcp_timeout_close_wait - INTEGER (seconds)
default 60
nf_conntrack_tcp_timeout_established - INTEGER (seconds)
default 432000 (5 days)
nf_conntrack_tcp_timeout_fin_wait - INTEGER (seconds)
default 120
nf_conntrack_tcp_timeout_last_ack - INTEGER (seconds)
default 30
nf_conntrack_tcp_timeout_max_retrans - INTEGER (seconds)
default 300
nf_conntrack_tcp_timeout_syn_recv - INTEGER (seconds)
default 60
nf_conntrack_tcp_timeout_syn_sent - INTEGER (seconds)
default 120
nf_conntrack_tcp_timeout_time_wait - INTEGER (seconds)
default 120
nf_conntrack_tcp_timeout_unacknowledged - INTEGER (seconds)
default 300
nf_conntrack_timestamp - BOOLEAN
0 - disabled (default)
not 0 - enabled
Enable connection tracking flow timestamping.
nf_conntrack_udp_timeout - INTEGER (seconds)
default 30
nf_conntrack_udp_timeout_stream2 - INTEGER (seconds)
default 180
This extended timeout will be used in case there is an UDP stream
detected.

View file

@ -0,0 +1,128 @@
Linux NFC subsystem
===================
The Near Field Communication (NFC) subsystem is required to standardize the
NFC device drivers development and to create an unified userspace interface.
This document covers the architecture overview, the device driver interface
description and the userspace interface description.
Architecture overview
---------------------
The NFC subsystem is responsible for:
- NFC adapters management;
- Polling for targets;
- Low-level data exchange;
The subsystem is divided in some parts. The 'core' is responsible for
providing the device driver interface. On the other side, it is also
responsible for providing an interface to control operations and low-level
data exchange.
The control operations are available to userspace via generic netlink.
The low-level data exchange interface is provided by the new socket family
PF_NFC. The NFC_SOCKPROTO_RAW performs raw communication with NFC targets.
+--------------------------------------+
| USER SPACE |
+--------------------------------------+
^ ^
| low-level | control
| data exchange | operations
| |
| v
| +-----------+
| AF_NFC | netlink |
| socket +-----------+
| raw ^
| |
v v
+---------+ +-----------+
| rawsock | <--------> | core |
+---------+ +-----------+
^
|
v
+-----------+
| driver |
+-----------+
Device Driver Interface
-----------------------
When registering on the NFC subsystem, the device driver must inform the core
of the set of supported NFC protocols and the set of ops callbacks. The ops
callbacks that must be implemented are the following:
* start_poll - setup the device to poll for targets
* stop_poll - stop on progress polling operation
* activate_target - select and initialize one of the targets found
* deactivate_target - deselect and deinitialize the selected target
* data_exchange - send data and receive the response (transceive operation)
Userspace interface
--------------------
The userspace interface is divided in control operations and low-level data
exchange operation.
CONTROL OPERATIONS:
Generic netlink is used to implement the interface to the control operations.
The operations are composed by commands and events, all listed below:
* NFC_CMD_GET_DEVICE - get specific device info or dump the device list
* NFC_CMD_START_POLL - setup a specific device to polling for targets
* NFC_CMD_STOP_POLL - stop the polling operation in a specific device
* NFC_CMD_GET_TARGET - dump the list of targets found by a specific device
* NFC_EVENT_DEVICE_ADDED - reports an NFC device addition
* NFC_EVENT_DEVICE_REMOVED - reports an NFC device removal
* NFC_EVENT_TARGETS_FOUND - reports START_POLL results when 1 or more targets
are found
The user must call START_POLL to poll for NFC targets, passing the desired NFC
protocols through NFC_ATTR_PROTOCOLS attribute. The device remains in polling
state until it finds any target. However, the user can stop the polling
operation by calling STOP_POLL command. In this case, it will be checked if
the requester of STOP_POLL is the same of START_POLL.
If the polling operation finds one or more targets, the event TARGETS_FOUND is
sent (including the device id). The user must call GET_TARGET to get the list of
all targets found by such device. Each reply message has target attributes with
relevant information such as the supported NFC protocols.
All polling operations requested through one netlink socket are stopped when
it's closed.
LOW-LEVEL DATA EXCHANGE:
The userspace must use PF_NFC sockets to perform any data communication with
targets. All NFC sockets use AF_NFC:
struct sockaddr_nfc {
sa_family_t sa_family;
__u32 dev_idx;
__u32 target_idx;
__u32 nfc_protocol;
};
To establish a connection with one target, the user must create an
NFC_SOCKPROTO_RAW socket and call the 'connect' syscall with the sockaddr_nfc
struct correctly filled. All information comes from NFC_EVENT_TARGETS_FOUND
netlink event. As a target can support more than one NFC protocol, the user
must inform which protocol it wants to use.
Internally, 'connect' will result in an activate_target call to the driver.
When the socket is closed, the target is deactivated.
The data format exchanged through the sockets is NFC protocol dependent. For
instance, when communicating with MIFARE tags, the data exchanged are MIFARE
commands and their responses.
The first received package is the response to the first sent package and so
on. In order to allow valid "empty" responses, every data received has a NULL
header of 1 byte.

View file

@ -0,0 +1,235 @@
Open vSwitch datapath developer documentation
=============================================
The Open vSwitch kernel module allows flexible userspace control over
flow-level packet processing on selected network devices. It can be
used to implement a plain Ethernet switch, network device bonding,
VLAN processing, network access control, flow-based network control,
and so on.
The kernel module implements multiple "datapaths" (analogous to
bridges), each of which can have multiple "vports" (analogous to ports
within a bridge). Each datapath also has associated with it a "flow
table" that userspace populates with "flows" that map from keys based
on packet headers and metadata to sets of actions. The most common
action forwards the packet to another vport; other actions are also
implemented.
When a packet arrives on a vport, the kernel module processes it by
extracting its flow key and looking it up in the flow table. If there
is a matching flow, it executes the associated actions. If there is
no match, it queues the packet to userspace for processing (as part of
its processing, userspace will likely set up a flow to handle further
packets of the same type entirely in-kernel).
Flow key compatibility
----------------------
Network protocols evolve over time. New protocols become important
and existing protocols lose their prominence. For the Open vSwitch
kernel module to remain relevant, it must be possible for newer
versions to parse additional protocols as part of the flow key. It
might even be desirable, someday, to drop support for parsing
protocols that have become obsolete. Therefore, the Netlink interface
to Open vSwitch is designed to allow carefully written userspace
applications to work with any version of the flow key, past or future.
To support this forward and backward compatibility, whenever the
kernel module passes a packet to userspace, it also passes along the
flow key that it parsed from the packet. Userspace then extracts its
own notion of a flow key from the packet and compares it against the
kernel-provided version:
- If userspace's notion of the flow key for the packet matches the
kernel's, then nothing special is necessary.
- If the kernel's flow key includes more fields than the userspace
version of the flow key, for example if the kernel decoded IPv6
headers but userspace stopped at the Ethernet type (because it
does not understand IPv6), then again nothing special is
necessary. Userspace can still set up a flow in the usual way,
as long as it uses the kernel-provided flow key to do it.
- If the userspace flow key includes more fields than the
kernel's, for example if userspace decoded an IPv6 header but
the kernel stopped at the Ethernet type, then userspace can
forward the packet manually, without setting up a flow in the
kernel. This case is bad for performance because every packet
that the kernel considers part of the flow must go to userspace,
but the forwarding behavior is correct. (If userspace can
determine that the values of the extra fields would not affect
forwarding behavior, then it could set up a flow anyway.)
How flow keys evolve over time is important to making this work, so
the following sections go into detail.
Flow key format
---------------
A flow key is passed over a Netlink socket as a sequence of Netlink
attributes. Some attributes represent packet metadata, defined as any
information about a packet that cannot be extracted from the packet
itself, e.g. the vport on which the packet was received. Most
attributes, however, are extracted from headers within the packet,
e.g. source and destination addresses from Ethernet, IP, or TCP
headers.
The <linux/openvswitch.h> header file defines the exact format of the
flow key attributes. For informal explanatory purposes here, we write
them as comma-separated strings, with parentheses indicating arguments
and nesting. For example, the following could represent a flow key
corresponding to a TCP packet that arrived on vport 1:
in_port(1), eth(src=e0:91:f5:21:d0:b2, dst=00:02:e3:0f:80:a4),
eth_type(0x0800), ipv4(src=172.16.0.20, dst=172.18.0.52, proto=17, tos=0,
frag=no), tcp(src=49163, dst=80)
Often we ellipsize arguments not important to the discussion, e.g.:
in_port(1), eth(...), eth_type(0x0800), ipv4(...), tcp(...)
Wildcarded flow key format
--------------------------
A wildcarded flow is described with two sequences of Netlink attributes
passed over the Netlink socket. A flow key, exactly as described above, and an
optional corresponding flow mask.
A wildcarded flow can represent a group of exact match flows. Each '1' bit
in the mask specifies a exact match with the corresponding bit in the flow key.
A '0' bit specifies a don't care bit, which will match either a '1' or '0' bit
of a incoming packet. Using wildcarded flow can improve the flow set up rate
by reduce the number of new flows need to be processed by the user space program.
Support for the mask Netlink attribute is optional for both the kernel and user
space program. The kernel can ignore the mask attribute, installing an exact
match flow, or reduce the number of don't care bits in the kernel to less than
what was specified by the user space program. In this case, variations in bits
that the kernel does not implement will simply result in additional flow setups.
The kernel module will also work with user space programs that neither support
nor supply flow mask attributes.
Since the kernel may ignore or modify wildcard bits, it can be difficult for
the userspace program to know exactly what matches are installed. There are
two possible approaches: reactively install flows as they miss the kernel
flow table (and therefore not attempt to determine wildcard changes at all)
or use the kernel's response messages to determine the installed wildcards.
When interacting with userspace, the kernel should maintain the match portion
of the key exactly as originally installed. This will provides a handle to
identify the flow for all future operations. However, when reporting the
mask of an installed flow, the mask should include any restrictions imposed
by the kernel.
The behavior when using overlapping wildcarded flows is undefined. It is the
responsibility of the user space program to ensure that any incoming packet
can match at most one flow, wildcarded or not. The current implementation
performs best-effort detection of overlapping wildcarded flows and may reject
some but not all of them. However, this behavior may change in future versions.
Basic rule for evolving flow keys
---------------------------------
Some care is needed to really maintain forward and backward
compatibility for applications that follow the rules listed under
"Flow key compatibility" above.
The basic rule is obvious:
------------------------------------------------------------------
New network protocol support must only supplement existing flow
key attributes. It must not change the meaning of already defined
flow key attributes.
------------------------------------------------------------------
This rule does have less-obvious consequences so it is worth working
through a few examples. Suppose, for example, that the kernel module
did not already implement VLAN parsing. Instead, it just interpreted
the 802.1Q TPID (0x8100) as the Ethertype then stopped parsing the
packet. The flow key for any packet with an 802.1Q header would look
essentially like this, ignoring metadata:
eth(...), eth_type(0x8100)
Naively, to add VLAN support, it makes sense to add a new "vlan" flow
key attribute to contain the VLAN tag, then continue to decode the
encapsulated headers beyond the VLAN tag using the existing field
definitions. With this change, a TCP packet in VLAN 10 would have a
flow key much like this:
eth(...), vlan(vid=10, pcp=0), eth_type(0x0800), ip(proto=6, ...), tcp(...)
But this change would negatively affect a userspace application that
has not been updated to understand the new "vlan" flow key attribute.
The application could, following the flow compatibility rules above,
ignore the "vlan" attribute that it does not understand and therefore
assume that the flow contained IP packets. This is a bad assumption
(the flow only contains IP packets if one parses and skips over the
802.1Q header) and it could cause the application's behavior to change
across kernel versions even though it follows the compatibility rules.
The solution is to use a set of nested attributes. This is, for
example, why 802.1Q support uses nested attributes. A TCP packet in
VLAN 10 is actually expressed as:
eth(...), eth_type(0x8100), vlan(vid=10, pcp=0), encap(eth_type(0x0800),
ip(proto=6, ...), tcp(...)))
Notice how the "eth_type", "ip", and "tcp" flow key attributes are
nested inside the "encap" attribute. Thus, an application that does
not understand the "vlan" key will not see either of those attributes
and therefore will not misinterpret them. (Also, the outer eth_type
is still 0x8100, not changed to 0x0800.)
Handling malformed packets
--------------------------
Don't drop packets in the kernel for malformed protocol headers, bad
checksums, etc. This would prevent userspace from implementing a
simple Ethernet switch that forwards every packet.
Instead, in such a case, include an attribute with "empty" content.
It doesn't matter if the empty content could be valid protocol values,
as long as those values are rarely seen in practice, because userspace
can always forward all packets with those values to userspace and
handle them individually.
For example, consider a packet that contains an IP header that
indicates protocol 6 for TCP, but which is truncated just after the IP
header, so that the TCP header is missing. The flow key for this
packet would include a tcp attribute with all-zero src and dst, like
this:
eth(...), eth_type(0x0800), ip(proto=6, ...), tcp(src=0, dst=0)
As another example, consider a packet with an Ethernet type of 0x8100,
indicating that a VLAN TCI should follow, but which is truncated just
after the Ethernet type. The flow key for this packet would include
an all-zero-bits vlan and an empty encap attribute, like this:
eth(...), eth_type(0x8100), vlan(0), encap()
Unlike a TCP packet with source and destination ports 0, an
all-zero-bits VLAN TCI is not that rare, so the CFI bit (aka
VLAN_TAG_PRESENT inside the kernel) is ordinarily set in a vlan
attribute expressly to allow this situation to be distinguished.
Thus, the flow key in this second example unambiguously indicates a
missing or malformed VLAN TCI.
Other rules
-----------
The other rules for flow keys are much less subtle:
- Duplicate attributes are not allowed at a given nesting level.
- Ordering of attributes is not significant.
- When the kernel sends a given flow key to userspace, it always
composes it the same way. This allows userspace to hash and
compare entire flow keys that it may not be able to fully
interpret.

View file

@ -0,0 +1,162 @@
1. Introduction
Linux distinguishes between administrative and operational state of an
interface. Administrative state is the result of "ip link set dev
<dev> up or down" and reflects whether the administrator wants to use
the device for traffic.
However, an interface is not usable just because the admin enabled it
- ethernet requires to be plugged into the switch and, depending on
a site's networking policy and configuration, an 802.1X authentication
to be performed before user data can be transferred. Operational state
shows the ability of an interface to transmit this user data.
Thanks to 802.1X, userspace must be granted the possibility to
influence operational state. To accommodate this, operational state is
split into two parts: Two flags that can be set by the driver only, and
a RFC2863 compatible state that is derived from these flags, a policy,
and changeable from userspace under certain rules.
2. Querying from userspace
Both admin and operational state can be queried via the netlink
operation RTM_GETLINK. It is also possible to subscribe to RTMGRP_LINK
to be notified of updates. This is important for setting from userspace.
These values contain interface state:
ifinfomsg::if_flags & IFF_UP:
Interface is admin up
ifinfomsg::if_flags & IFF_RUNNING:
Interface is in RFC2863 operational state UP or UNKNOWN. This is for
backward compatibility, routing daemons, dhcp clients can use this
flag to determine whether they should use the interface.
ifinfomsg::if_flags & IFF_LOWER_UP:
Driver has signaled netif_carrier_on()
ifinfomsg::if_flags & IFF_DORMANT:
Driver has signaled netif_dormant_on()
TLV IFLA_OPERSTATE
contains RFC2863 state of the interface in numeric representation:
IF_OPER_UNKNOWN (0):
Interface is in unknown state, neither driver nor userspace has set
operational state. Interface must be considered for user data as
setting operational state has not been implemented in every driver.
IF_OPER_NOTPRESENT (1):
Unused in current kernel (notpresent interfaces normally disappear),
just a numerical placeholder.
IF_OPER_DOWN (2):
Interface is unable to transfer data on L1, f.e. ethernet is not
plugged or interface is ADMIN down.
IF_OPER_LOWERLAYERDOWN (3):
Interfaces stacked on an interface that is IF_OPER_DOWN show this
state (f.e. VLAN).
IF_OPER_TESTING (4):
Unused in current kernel.
IF_OPER_DORMANT (5):
Interface is L1 up, but waiting for an external event, f.e. for a
protocol to establish. (802.1X)
IF_OPER_UP (6):
Interface is operational up and can be used.
This TLV can also be queried via sysfs.
TLV IFLA_LINKMODE
contains link policy. This is needed for userspace interaction
described below.
This TLV can also be queried via sysfs.
3. Kernel driver API
Kernel drivers have access to two flags that map to IFF_LOWER_UP and
IFF_DORMANT. These flags can be set from everywhere, even from
interrupts. It is guaranteed that only the driver has write access,
however, if different layers of the driver manipulate the same flag,
the driver has to provide the synchronisation needed.
__LINK_STATE_NOCARRIER, maps to !IFF_LOWER_UP:
The driver uses netif_carrier_on() to clear and netif_carrier_off() to
set this flag. On netif_carrier_off(), the scheduler stops sending
packets. The name 'carrier' and the inversion are historical, think of
it as lower layer.
Note that for certain kind of soft-devices, which are not managing any
real hardware, it is possible to set this bit from userspace. One
should use TVL IFLA_CARRIER to do so.
netif_carrier_ok() can be used to query that bit.
__LINK_STATE_DORMANT, maps to IFF_DORMANT:
Set by the driver to express that the device cannot yet be used
because some driver controlled protocol establishment has to
complete. Corresponding functions are netif_dormant_on() to set the
flag, netif_dormant_off() to clear it and netif_dormant() to query.
On device allocation, networking core sets the flags equivalent to
netif_carrier_ok() and !netif_dormant().
Whenever the driver CHANGES one of these flags, a workqueue event is
scheduled to translate the flag combination to IFLA_OPERSTATE as
follows:
!netif_carrier_ok():
IF_OPER_LOWERLAYERDOWN if the interface is stacked, IF_OPER_DOWN
otherwise. Kernel can recognise stacked interfaces because their
ifindex != iflink.
netif_carrier_ok() && netif_dormant():
IF_OPER_DORMANT
netif_carrier_ok() && !netif_dormant():
IF_OPER_UP if userspace interaction is disabled. Otherwise
IF_OPER_DORMANT with the possibility for userspace to initiate the
IF_OPER_UP transition afterwards.
4. Setting from userspace
Applications have to use the netlink interface to influence the
RFC2863 operational state of an interface. Setting IFLA_LINKMODE to 1
via RTM_SETLINK instructs the kernel that an interface should go to
IF_OPER_DORMANT instead of IF_OPER_UP when the combination
netif_carrier_ok() && !netif_dormant() is set by the
driver. Afterwards, the userspace application can set IFLA_OPERSTATE
to IF_OPER_DORMANT or IF_OPER_UP as long as the driver does not set
netif_carrier_off() or netif_dormant_on(). Changes made by userspace
are multicasted on the netlink group RTMGRP_LINK.
So basically a 802.1X supplicant interacts with the kernel like this:
-subscribe to RTMGRP_LINK
-set IFLA_LINKMODE to 1 via RTM_SETLINK
-query RTM_GETLINK once to get initial state
-if initial flags are not (IFF_LOWER_UP && !IFF_DORMANT), wait until
netlink multicast signals this state
-do 802.1X, eventually abort if flags go down again
-send RTM_SETLINK to set operstate to IF_OPER_UP if authentication
succeeds, IF_OPER_DORMANT otherwise
-see how operstate and IFF_RUNNING is echoed via netlink multicast
-set interface back to IF_OPER_DORMANT if 802.1X reauthentication
fails
-restart if kernel changes IFF_LOWER_UP or IFF_DORMANT flag
if supplicant goes down, bring back IFLA_LINKMODE to 0 and
IFLA_OPERSTATE to a sane value.
A routing daemon or dhcp client just needs to care for IFF_RUNNING or
waiting for operstate to go IF_OPER_UP/IF_OPER_UNKNOWN before
considering the interface / querying a DHCP address.
For technical questions and/or comments please e-mail to Stefan Rompf
(stefan at loplof.de).

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,214 @@
Linux Phonet protocol family
============================
Introduction
------------
Phonet is a packet protocol used by Nokia cellular modems for both IPC
and RPC. With the Linux Phonet socket family, Linux host processes can
receive and send messages from/to the modem, or any other external
device attached to the modem. The modem takes care of routing.
Phonet packets can be exchanged through various hardware connections
depending on the device, such as:
- USB with the CDC Phonet interface,
- infrared,
- Bluetooth,
- an RS232 serial port (with a dedicated "FBUS" line discipline),
- the SSI bus with some TI OMAP processors.
Packets format
--------------
Phonet packets have a common header as follows:
struct phonethdr {
uint8_t pn_media; /* Media type (link-layer identifier) */
uint8_t pn_rdev; /* Receiver device ID */
uint8_t pn_sdev; /* Sender device ID */
uint8_t pn_res; /* Resource ID or function */
uint16_t pn_length; /* Big-endian message byte length (minus 6) */
uint8_t pn_robj; /* Receiver object ID */
uint8_t pn_sobj; /* Sender object ID */
};
On Linux, the link-layer header includes the pn_media byte (see below).
The next 7 bytes are part of the network-layer header.
The device ID is split: the 6 higher-order bits constitute the device
address, while the 2 lower-order bits are used for multiplexing, as are
the 8-bit object identifiers. As such, Phonet can be considered as a
network layer with 6 bits of address space and 10 bits for transport
protocol (much like port numbers in IP world).
The modem always has address number zero. All other device have a their
own 6-bit address.
Link layer
----------
Phonet links are always point-to-point links. The link layer header
consists of a single Phonet media type byte. It uniquely identifies the
link through which the packet is transmitted, from the modem's
perspective. Each Phonet network device shall prepend and set the media
type byte as appropriate. For convenience, a common phonet_header_ops
link-layer header operations structure is provided. It sets the
media type according to the network device hardware address.
Linux Phonet network interfaces support a dedicated link layer packets
type (ETH_P_PHONET) which is out of the Ethernet type range. They can
only send and receive Phonet packets.
The virtual TUN tunnel device driver can also be used for Phonet. This
requires IFF_TUN mode, _without_ the IFF_NO_PI flag. In this case,
there is no link-layer header, so there is no Phonet media type byte.
Note that Phonet interfaces are not allowed to re-order packets, so
only the (default) Linux FIFO qdisc should be used with them.
Network layer
-------------
The Phonet socket address family maps the Phonet packet header:
struct sockaddr_pn {
sa_family_t spn_family; /* AF_PHONET */
uint8_t spn_obj; /* Object ID */
uint8_t spn_dev; /* Device ID */
uint8_t spn_resource; /* Resource or function */
uint8_t spn_zero[...]; /* Padding */
};
The resource field is only used when sending and receiving;
It is ignored by bind() and getsockname().
Low-level datagram protocol
---------------------------
Applications can send Phonet messages using the Phonet datagram socket
protocol from the PF_PHONET family. Each socket is bound to one of the
2^10 object IDs available, and can send and receive packets with any
other peer.
struct sockaddr_pn addr = { .spn_family = AF_PHONET, };
ssize_t len;
socklen_t addrlen = sizeof(addr);
int fd;
fd = socket(PF_PHONET, SOCK_DGRAM, 0);
bind(fd, (struct sockaddr *)&addr, sizeof(addr));
/* ... */
sendto(fd, msg, msglen, 0, (struct sockaddr *)&addr, sizeof(addr));
len = recvfrom(fd, buf, sizeof(buf), 0,
(struct sockaddr *)&addr, &addrlen);
This protocol follows the SOCK_DGRAM connection-less semantics.
However, connect() and getpeername() are not supported, as they did
not seem useful with Phonet usages (could be added easily).
Resource subscription
---------------------
A Phonet datagram socket can be subscribed to any number of 8-bits
Phonet resources, as follow:
uint32_t res = 0xXX;
ioctl(fd, SIOCPNADDRESOURCE, &res);
Subscription is similarly cancelled using the SIOCPNDELRESOURCE I/O
control request, or when the socket is closed.
Note that no more than one socket can be subcribed to any given
resource at a time. If not, ioctl() will return EBUSY.
Phonet Pipe protocol
--------------------
The Phonet Pipe protocol is a simple sequenced packets protocol
with end-to-end congestion control. It uses the passive listening
socket paradigm. The listening socket is bound to an unique free object
ID. Each listening socket can handle up to 255 simultaneous
connections, one per accept()'d socket.
int lfd, cfd;
lfd = socket(PF_PHONET, SOCK_SEQPACKET, PN_PROTO_PIPE);
listen (lfd, INT_MAX);
/* ... */
cfd = accept(lfd, NULL, NULL);
for (;;)
{
char buf[...];
ssize_t len = read(cfd, buf, sizeof(buf));
/* ... */
write(cfd, msg, msglen);
}
Connections are traditionally established between two endpoints by a
"third party" application. This means that both endpoints are passive.
As of Linux kernel version 2.6.39, it is also possible to connect
two endpoints directly, using connect() on the active side. This is
intended to support the newer Nokia Wireless Modem API, as found in
e.g. the Nokia Slim Modem in the ST-Ericsson U8500 platform:
struct sockaddr_spn spn;
int fd;
fd = socket(PF_PHONET, SOCK_SEQPACKET, PN_PROTO_PIPE);
memset(&spn, 0, sizeof(spn));
spn.spn_family = AF_PHONET;
spn.spn_obj = ...;
spn.spn_dev = ...;
spn.spn_resource = 0xD9;
connect(fd, (struct sockaddr *)&spn, sizeof(spn));
/* normal I/O here ... */
close(fd);
WARNING:
When polling a connected pipe socket for writability, there is an
intrinsic race condition whereby writability might be lost between the
polling and the writing system calls. In this case, the socket will
block until write becomes possible again, unless non-blocking mode
is enabled.
The pipe protocol provides two socket options at the SOL_PNPIPE level:
PNPIPE_ENCAP accepts one integer value (int) of:
PNPIPE_ENCAP_NONE: The socket operates normally (default).
PNPIPE_ENCAP_IP: The socket is used as a backend for a virtual IP
interface. This requires CAP_NET_ADMIN capability. GPRS data
support on Nokia modems can use this. Note that the socket cannot
be reliably poll()'d or read() from while in this mode.
PNPIPE_IFINDEX is a read-only integer value. It contains the
interface index of the network interface created by PNPIPE_ENCAP,
or zero if encapsulation is off.
PNPIPE_HANDLE is a read-only integer value. It contains the underlying
identifier ("pipe handle") of the pipe. This is only defined for
socket descriptors that are already connected or being connected.
Authors
-------
Linux Phonet was initially written by Sakari Ailus.
Other contributors include Mikä Liljeberg, Andras Domokos,
Carlos Chinea and Rémi Denis-Courmont.
Copyright (C) 2008 Nokia Corporation.

View file

@ -0,0 +1,339 @@
-------
PHY Abstraction Layer
(Updated 2008-04-08)
Purpose
Most network devices consist of set of registers which provide an interface
to a MAC layer, which communicates with the physical connection through a
PHY. The PHY concerns itself with negotiating link parameters with the link
partner on the other side of the network connection (typically, an ethernet
cable), and provides a register interface to allow drivers to determine what
settings were chosen, and to configure what settings are allowed.
While these devices are distinct from the network devices, and conform to a
standard layout for the registers, it has been common practice to integrate
the PHY management code with the network driver. This has resulted in large
amounts of redundant code. Also, on embedded systems with multiple (and
sometimes quite different) ethernet controllers connected to the same
management bus, it is difficult to ensure safe use of the bus.
Since the PHYs are devices, and the management busses through which they are
accessed are, in fact, busses, the PHY Abstraction Layer treats them as such.
In doing so, it has these goals:
1) Increase code-reuse
2) Increase overall code-maintainability
3) Speed development time for new network drivers, and for new systems
Basically, this layer is meant to provide an interface to PHY devices which
allows network driver writers to write as little code as possible, while
still providing a full feature set.
The MDIO bus
Most network devices are connected to a PHY by means of a management bus.
Different devices use different busses (though some share common interfaces).
In order to take advantage of the PAL, each bus interface needs to be
registered as a distinct device.
1) read and write functions must be implemented. Their prototypes are:
int write(struct mii_bus *bus, int mii_id, int regnum, u16 value);
int read(struct mii_bus *bus, int mii_id, int regnum);
mii_id is the address on the bus for the PHY, and regnum is the register
number. These functions are guaranteed not to be called from interrupt
time, so it is safe for them to block, waiting for an interrupt to signal
the operation is complete
2) A reset function is optional. This is used to return the bus to an
initialized state.
3) A probe function is needed. This function should set up anything the bus
driver needs, setup the mii_bus structure, and register with the PAL using
mdiobus_register. Similarly, there's a remove function to undo all of
that (use mdiobus_unregister).
4) Like any driver, the device_driver structure must be configured, and init
exit functions are used to register the driver.
5) The bus must also be declared somewhere as a device, and registered.
As an example for how one driver implemented an mdio bus driver, see
drivers/net/ethernet/freescale/fsl_pq_mdio.c and an associated DTS file
for one of the users. (e.g. "git grep fsl,.*-mdio arch/powerpc/boot/dts/")
Connecting to a PHY
Sometime during startup, the network driver needs to establish a connection
between the PHY device, and the network device. At this time, the PHY's bus
and drivers need to all have been loaded, so it is ready for the connection.
At this point, there are several ways to connect to the PHY:
1) The PAL handles everything, and only calls the network driver when
the link state changes, so it can react.
2) The PAL handles everything except interrupts (usually because the
controller has the interrupt registers).
3) The PAL handles everything, but checks in with the driver every second,
allowing the network driver to react first to any changes before the PAL
does.
4) The PAL serves only as a library of functions, with the network device
manually calling functions to update status, and configure the PHY
Letting the PHY Abstraction Layer do Everything
If you choose option 1 (The hope is that every driver can, but to still be
useful to drivers that can't), connecting to the PHY is simple:
First, you need a function to react to changes in the link state. This
function follows this protocol:
static void adjust_link(struct net_device *dev);
Next, you need to know the device name of the PHY connected to this device.
The name will look something like, "0:00", where the first number is the
bus id, and the second is the PHY's address on that bus. Typically,
the bus is responsible for making its ID unique.
Now, to connect, just call this function:
phydev = phy_connect(dev, phy_name, &adjust_link, interface);
phydev is a pointer to the phy_device structure which represents the PHY. If
phy_connect is successful, it will return the pointer. dev, here, is the
pointer to your net_device. Once done, this function will have started the
PHY's software state machine, and registered for the PHY's interrupt, if it
has one. The phydev structure will be populated with information about the
current state, though the PHY will not yet be truly operational at this
point.
PHY-specific flags should be set in phydev->dev_flags prior to the call
to phy_connect() such that the underlying PHY driver can check for flags
and perform specific operations based on them.
This is useful if the system has put hardware restrictions on
the PHY/controller, of which the PHY needs to be aware.
interface is a u32 which specifies the connection type used
between the controller and the PHY. Examples are GMII, MII,
RGMII, and SGMII. For a full list, see include/linux/phy.h
Now just make sure that phydev->supported and phydev->advertising have any
values pruned from them which don't make sense for your controller (a 10/100
controller may be connected to a gigabit capable PHY, so you would need to
mask off SUPPORTED_1000baseT*). See include/linux/ethtool.h for definitions
for these bitfields. Note that you should not SET any bits, or the PHY may
get put into an unsupported state.
Lastly, once the controller is ready to handle network traffic, you call
phy_start(phydev). This tells the PAL that you are ready, and configures the
PHY to connect to the network. If you want to handle your own interrupts,
just set phydev->irq to PHY_IGNORE_INTERRUPT before you call phy_start.
Similarly, if you don't want to use interrupts, set phydev->irq to PHY_POLL.
When you want to disconnect from the network (even if just briefly), you call
phy_stop(phydev).
Keeping Close Tabs on the PAL
It is possible that the PAL's built-in state machine needs a little help to
keep your network device and the PHY properly in sync. If so, you can
register a helper function when connecting to the PHY, which will be called
every second before the state machine reacts to any changes. To do this, you
need to manually call phy_attach() and phy_prepare_link(), and then call
phy_start_machine() with the second argument set to point to your special
handler.
Currently there are no examples of how to use this functionality, and testing
on it has been limited because the author does not have any drivers which use
it (they all use option 1). So Caveat Emptor.
Doing it all yourself
There's a remote chance that the PAL's built-in state machine cannot track
the complex interactions between the PHY and your network device. If this is
so, you can simply call phy_attach(), and not call phy_start_machine or
phy_prepare_link(). This will mean that phydev->state is entirely yours to
handle (phy_start and phy_stop toggle between some of the states, so you
might need to avoid them).
An effort has been made to make sure that useful functionality can be
accessed without the state-machine running, and most of these functions are
descended from functions which did not interact with a complex state-machine.
However, again, no effort has been made so far to test running without the
state machine, so tryer beware.
Here is a brief rundown of the functions:
int phy_read(struct phy_device *phydev, u16 regnum);
int phy_write(struct phy_device *phydev, u16 regnum, u16 val);
Simple read/write primitives. They invoke the bus's read/write function
pointers.
void phy_print_status(struct phy_device *phydev);
A convenience function to print out the PHY status neatly.
int phy_start_interrupts(struct phy_device *phydev);
int phy_stop_interrupts(struct phy_device *phydev);
Requests the IRQ for the PHY interrupts, then enables them for
start, or disables then frees them for stop.
struct phy_device * phy_attach(struct net_device *dev, const char *phy_id,
phy_interface_t interface);
Attaches a network device to a particular PHY, binding the PHY to a generic
driver if none was found during bus initialization.
int phy_start_aneg(struct phy_device *phydev);
Using variables inside the phydev structure, either configures advertising
and resets autonegotiation, or disables autonegotiation, and configures
forced settings.
static inline int phy_read_status(struct phy_device *phydev);
Fills the phydev structure with up-to-date information about the current
settings in the PHY.
int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd);
int phy_ethtool_gset(struct phy_device *phydev, struct ethtool_cmd *cmd);
Ethtool convenience functions.
int phy_mii_ioctl(struct phy_device *phydev,
struct mii_ioctl_data *mii_data, int cmd);
The MII ioctl. Note that this function will completely screw up the state
machine if you write registers like BMCR, BMSR, ADVERTISE, etc. Best to
use this only to write registers which are not standard, and don't set off
a renegotiation.
PHY Device Drivers
With the PHY Abstraction Layer, adding support for new PHYs is
quite easy. In some cases, no work is required at all! However,
many PHYs require a little hand-holding to get up-and-running.
Generic PHY driver
If the desired PHY doesn't have any errata, quirks, or special
features you want to support, then it may be best to not add
support, and let the PHY Abstraction Layer's Generic PHY Driver
do all of the work.
Writing a PHY driver
If you do need to write a PHY driver, the first thing to do is
make sure it can be matched with an appropriate PHY device.
This is done during bus initialization by reading the device's
UID (stored in registers 2 and 3), then comparing it to each
driver's phy_id field by ANDing it with each driver's
phy_id_mask field. Also, it needs a name. Here's an example:
static struct phy_driver dm9161_driver = {
.phy_id = 0x0181b880,
.name = "Davicom DM9161E",
.phy_id_mask = 0x0ffffff0,
...
}
Next, you need to specify what features (speed, duplex, autoneg,
etc) your PHY device and driver support. Most PHYs support
PHY_BASIC_FEATURES, but you can look in include/mii.h for other
features.
Each driver consists of a number of function pointers:
soft_reset: perform a PHY software reset
config_init: configures PHY into a sane state after a reset.
For instance, a Davicom PHY requires descrambling disabled.
probe: Allocate phy->priv, optionally refuse to bind.
PHY may not have been reset or had fixups run yet.
suspend/resume: power management
config_aneg: Changes the speed/duplex/negotiation settings
aneg_done: Determines the auto-negotiation result
read_status: Reads the current speed/duplex/negotiation settings
ack_interrupt: Clear a pending interrupt
did_interrupt: Checks if the PHY generated an interrupt
config_intr: Enable or disable interrupts
remove: Does any driver take-down
ts_info: Queries about the HW timestamping status
hwtstamp: Set the PHY HW timestamping configuration
rxtstamp: Requests a receive timestamp at the PHY level for a 'skb'
txtsamp: Requests a transmit timestamp at the PHY level for a 'skb'
set_wol: Enable Wake-on-LAN at the PHY level
get_wol: Get the Wake-on-LAN status at the PHY level
read_mmd_indirect: Read PHY MMD indirect register
write_mmd_indirect: Write PHY MMD indirect register
Of these, only config_aneg and read_status are required to be
assigned by the driver code. The rest are optional. Also, it is
preferred to use the generic phy driver's versions of these two
functions if at all possible: genphy_read_status and
genphy_config_aneg. If this is not possible, it is likely that
you only need to perform some actions before and after invoking
these functions, and so your functions will wrap the generic
ones.
Feel free to look at the Marvell, Cicada, and Davicom drivers in
drivers/net/phy/ for examples (the lxt and qsemi drivers have
not been tested as of this writing).
The PHY's MMD register accesses are handled by the PAL framework
by default, but can be overridden by a specific PHY driver if
required. This could be the case if a PHY was released for
manufacturing before the MMD PHY register definitions were
standardized by the IEEE. Most modern PHYs will be able to use
the generic PAL framework for accessing the PHY's MMD registers.
An example of such usage is for Energy Efficient Ethernet support,
implemented in the PAL. This support uses the PAL to access MMD
registers for EEE query and configuration if the PHY supports
the IEEE standard access mechanisms, or can use the PHY's specific
access interfaces if overridden by the specific PHY driver. See
the Micrel driver in drivers/net/phy/ for an example of how this
can be implemented.
Board Fixups
Sometimes the specific interaction between the platform and the PHY requires
special handling. For instance, to change where the PHY's clock input is,
or to add a delay to account for latency issues in the data path. In order
to support such contingencies, the PHY Layer allows platform code to register
fixups to be run when the PHY is brought up (or subsequently reset).
When the PHY Layer brings up a PHY it checks to see if there are any fixups
registered for it, matching based on UID (contained in the PHY device's phy_id
field) and the bus identifier (contained in phydev->dev.bus_id). Both must
match, however two constants, PHY_ANY_ID and PHY_ANY_UID, are provided as
wildcards for the bus ID and UID, respectively.
When a match is found, the PHY layer will invoke the run function associated
with the fixup. This function is passed a pointer to the phy_device of
interest. It should therefore only operate on that PHY.
The platform code can either register the fixup using phy_register_fixup():
int phy_register_fixup(const char *phy_id,
u32 phy_uid, u32 phy_uid_mask,
int (*run)(struct phy_device *));
Or using one of the two stubs, phy_register_fixup_for_uid() and
phy_register_fixup_for_id():
int phy_register_fixup_for_uid(u32 phy_uid, u32 phy_uid_mask,
int (*run)(struct phy_device *));
int phy_register_fixup_for_id(const char *phy_id,
int (*run)(struct phy_device *));
The stubs set one of the two matching criteria, and set the other one to
match anything.

View file

@ -0,0 +1,321 @@
HOWTO for the linux packet generator
------------------------------------
Date: 041221
Enable CONFIG_NET_PKTGEN to compile and build pktgen.o either in kernel
or as module. Module is preferred. insmod pktgen if needed. Once running
pktgen creates a thread on each CPU where each thread has affinity to its CPU.
Monitoring and controlling is done via /proc. Easiest to select a suitable
a sample script and configure.
On a dual CPU:
ps aux | grep pkt
root 129 0.3 0.0 0 0 ? SW 2003 523:20 [pktgen/0]
root 130 0.3 0.0 0 0 ? SW 2003 509:50 [pktgen/1]
For monitoring and control pktgen creates:
/proc/net/pktgen/pgctrl
/proc/net/pktgen/kpktgend_X
/proc/net/pktgen/ethX
Tuning NIC for max performance
==============================
The default NIC setting are (likely) not tuned for pktgen's artificial
overload type of benchmarking, as this could hurt the normal use-case.
Specifically increasing the TX ring buffer in the NIC:
# ethtool -G ethX tx 1024
A larger TX ring can improve pktgen's performance, while it can hurt
in the general case, 1) because the TX ring buffer might get larger
than the CPUs L1/L2 cache, 2) because it allow more queueing in the
NIC HW layer (which is bad for bufferbloat).
One should be careful to conclude, that packets/descriptors in the HW
TX ring cause delay. Drivers usually delay cleaning up the
ring-buffers (for various performance reasons), thus packets stalling
the TX ring, might just be waiting for cleanup.
This cleanup issues is specifically the case, for the driver ixgbe
(Intel 82599 chip). This driver (ixgbe) combine TX+RX ring cleanups,
and the cleanup interval is affected by the ethtool --coalesce setting
of parameter "rx-usecs".
For ixgbe use e.g "30" resulting in approx 33K interrupts/sec (1/30*10^6):
# ethtool -C ethX rx-usecs 30
Viewing threads
===============
/proc/net/pktgen/kpktgend_0
Name: kpktgend_0 max_before_softirq: 10000
Running:
Stopped: eth1
Result: OK: max_before_softirq=10000
Most important the devices assigned to thread. Note! A device can only belong
to one thread.
Viewing devices
===============
Parm section holds configured info. Current hold running stats.
Result is printed after run or after interruption. Example:
/proc/net/pktgen/eth1
Params: count 10000000 min_pkt_size: 60 max_pkt_size: 60
frags: 0 delay: 0 clone_skb: 1000000 ifname: eth1
flows: 0 flowlen: 0
dst_min: 10.10.11.2 dst_max:
src_min: src_max:
src_mac: 00:00:00:00:00:00 dst_mac: 00:04:23:AC:FD:82
udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9 udp_dst_max: 9
src_mac_count: 0 dst_mac_count: 0
Flags:
Current:
pkts-sofar: 10000000 errors: 39664
started: 1103053986245187us stopped: 1103053999346329us idle: 880401us
seq_num: 10000011 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
cur_saddr: 0x10a0a0a cur_daddr: 0x20b0a0a
cur_udp_dst: 9 cur_udp_src: 9
flows: 0
Result: OK: 13101142(c12220741+d880401) usec, 10000000 (60byte,0frags)
763292pps 390Mb/sec (390805504bps) errors: 39664
Configuring threads and devices
================================
This is done via the /proc interface easiest done via pgset in the scripts
Examples:
pgset "clone_skb 1" sets the number of copies of the same packet
pgset "clone_skb 0" use single SKB for all transmits
pgset "burst 8" uses xmit_more API to queue 8 copies of the same
packet and update HW tx queue tail pointer once.
"burst 1" is the default
pgset "pkt_size 9014" sets packet size to 9014
pgset "frags 5" packet will consist of 5 fragments
pgset "count 200000" sets number of packets to send, set to zero
for continuous sends until explicitly stopped.
pgset "delay 5000" adds delay to hard_start_xmit(). nanoseconds
pgset "dst 10.0.0.1" sets IP destination address
(BEWARE! This generator is very aggressive!)
pgset "dst_min 10.0.0.1" Same as dst
pgset "dst_max 10.0.0.254" Set the maximum destination IP.
pgset "src_min 10.0.0.1" Set the minimum (or only) source IP.
pgset "src_max 10.0.0.254" Set the maximum source IP.
pgset "dst6 fec0::1" IPV6 destination address
pgset "src6 fec0::2" IPV6 source address
pgset "dstmac 00:00:00:00:00:00" sets MAC destination address
pgset "srcmac 00:00:00:00:00:00" sets MAC source address
pgset "queue_map_min 0" Sets the min value of tx queue interval
pgset "queue_map_max 7" Sets the max value of tx queue interval, for multiqueue devices
To select queue 1 of a given device,
use queue_map_min=1 and queue_map_max=1
pgset "src_mac_count 1" Sets the number of MACs we'll range through.
The 'minimum' MAC is what you set with srcmac.
pgset "dst_mac_count 1" Sets the number of MACs we'll range through.
The 'minimum' MAC is what you set with dstmac.
pgset "flag [name]" Set a flag to determine behaviour. Current flags
are: IPSRC_RND # IP source is random (between min/max)
IPDST_RND # IP destination is random
UDPSRC_RND, UDPDST_RND,
MACSRC_RND, MACDST_RND
TXSIZE_RND, IPV6,
MPLS_RND, VID_RND, SVID_RND
FLOW_SEQ,
QUEUE_MAP_RND # queue map random
QUEUE_MAP_CPU # queue map mirrors smp_processor_id()
UDPCSUM,
IPSEC # IPsec encapsulation (needs CONFIG_XFRM)
NODE_ALLOC # node specific memory allocation
pgset spi SPI_VALUE Set specific SA used to transform packet.
pgset "udp_src_min 9" set UDP source port min, If < udp_src_max, then
cycle through the port range.
pgset "udp_src_max 9" set UDP source port max.
pgset "udp_dst_min 9" set UDP destination port min, If < udp_dst_max, then
cycle through the port range.
pgset "udp_dst_max 9" set UDP destination port max.
pgset "mpls 0001000a,0002000a,0000000a" set MPLS labels (in this example
outer label=16,middle label=32,
inner label=0 (IPv4 NULL)) Note that
there must be no spaces between the
arguments. Leading zeros are required.
Do not set the bottom of stack bit,
that's done automatically. If you do
set the bottom of stack bit, that
indicates that you want to randomly
generate that address and the flag
MPLS_RND will be turned on. You
can have any mix of random and fixed
labels in the label stack.
pgset "mpls 0" turn off mpls (or any invalid argument works too!)
pgset "vlan_id 77" set VLAN ID 0-4095
pgset "vlan_p 3" set priority bit 0-7 (default 0)
pgset "vlan_cfi 0" set canonical format identifier 0-1 (default 0)
pgset "svlan_id 22" set SVLAN ID 0-4095
pgset "svlan_p 3" set priority bit 0-7 (default 0)
pgset "svlan_cfi 0" set canonical format identifier 0-1 (default 0)
pgset "vlan_id 9999" > 4095 remove vlan and svlan tags
pgset "svlan 9999" > 4095 remove svlan tag
pgset "tos XX" set former IPv4 TOS field (e.g. "tos 28" for AF11 no ECN, default 00)
pgset "traffic_class XX" set former IPv6 TRAFFIC CLASS (e.g. "traffic_class B8" for EF no ECN, default 00)
pgset stop aborts injection. Also, ^C aborts generator.
pgset "rate 300M" set rate to 300 Mb/s
pgset "ratep 1000000" set rate to 1Mpps
Example scripts
===============
A collection of small tutorial scripts for pktgen is in examples dir.
pktgen.conf-1-1 # 1 CPU 1 dev
pktgen.conf-1-2 # 1 CPU 2 dev
pktgen.conf-2-1 # 2 CPU's 1 dev
pktgen.conf-2-2 # 2 CPU's 2 dev
pktgen.conf-1-1-rdos # 1 CPU 1 dev w. route DoS
pktgen.conf-1-1-ip6 # 1 CPU 1 dev ipv6
pktgen.conf-1-1-ip6-rdos # 1 CPU 1 dev ipv6 w. route DoS
pktgen.conf-1-1-flows # 1 CPU 1 dev multiple flows.
Run in shell: ./pktgen.conf-X-Y It does all the setup including sending.
Interrupt affinity
===================
Note when adding devices to a specific CPU there good idea to also assign
/proc/irq/XX/smp_affinity so the TX-interrupts gets bound to the same CPU.
as this reduces cache bouncing when freeing skb's.
Enable IPsec
============
Default IPsec transformation with ESP encapsulation plus Transport mode
could be enabled by simply setting:
pgset "flag IPSEC"
pgset "flows 1"
To avoid breaking existing testbed scripts for using AH type and tunnel mode,
user could use "pgset spi SPI_VALUE" to specify which formal of transformation
to employ.
Current commands and configuration options
==========================================
** Pgcontrol commands:
start
stop
** Thread commands:
add_device
rem_device_all
max_before_softirq
** Device commands:
count
clone_skb
debug
frags
delay
src_mac_count
dst_mac_count
pkt_size
min_pkt_size
max_pkt_size
mpls
udp_src_min
udp_src_max
udp_dst_min
udp_dst_max
flag
IPSRC_RND
IPDST_RND
UDPSRC_RND
UDPDST_RND
MACSRC_RND
MACDST_RND
TXSIZE_RND
IPV6
MPLS_RND
VID_RND
SVID_RND
FLOW_SEQ
QUEUE_MAP_RND
QUEUE_MAP_CPU
UDPCSUM
IPSEC
NODE_ALLOC
dst_min
dst_max
src_min
src_max
dst_mac
src_mac
clear_counters
dst6
src6
flows
flowlen
rate
ratep
References:
ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/
ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/
Paper from Linux-Kongress in Erlangen 2004.
ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/pktgen_paper.pdf
Thanks to:
Grant Grundler for testing on IA-64 and parisc, Harald Welte, Lennert Buytenhek
Stephen Hemminger, Andi Kleen, Dave Miller and many others.
Good luck with the linux net-development.

View file

@ -0,0 +1,150 @@
Classes
-------
"Class" is a complete routing table in common sense.
I.e. it is tree of nodes (destination prefix, tos, metric)
with attached information: gateway, device etc.
This tree is looked up as specified in RFC1812 5.2.4.3
1. Basic match
2. Longest match
3. Weak TOS.
4. Metric. (should not be in kernel space, but they are)
5. Additional pruning rules. (not in kernel space).
We have two special type of nodes:
REJECT - abort route lookup and return an error value.
THROW - abort route lookup in this class.
Currently the number of classes is limited to 255
(0 is reserved for "not specified class")
Three classes are builtin:
RT_CLASS_LOCAL=255 - local interface addresses,
broadcasts, nat addresses.
RT_CLASS_MAIN=254 - all normal routes are put there
by default.
RT_CLASS_DEFAULT=253 - if ip_fib_model==1, then
normal default routes are put there, if ip_fib_model==2
all gateway routes are put there.
Rules
-----
Rule is a record of (src prefix, src interface, tos, dst prefix)
with attached information.
Rule types:
RTP_ROUTE - lookup in attached class
RTP_NAT - lookup in attached class and if a match is found,
translate packet source address.
RTP_MASQUERADE - lookup in attached class and if a match is found,
masquerade packet as sourced by us.
RTP_DROP - silently drop the packet.
RTP_REJECT - drop the packet and send ICMP NET UNREACHABLE.
RTP_PROHIBIT - drop the packet and send ICMP COMM. ADM. PROHIBITED.
Rule flags:
RTRF_LOG - log route creations.
RTRF_VALVE - One way route (used with masquerading)
Default setup:
root@amber:/pub/ip-routing # iproute -r
Kernel routing policy rules
Pref Source Destination TOS Iface Cl
0 default default 00 * 255
254 default default 00 * 254
255 default default 00 * 253
Lookup algorithm
----------------
We scan rules list, and if a rule is matched, apply it.
If a route is found, return it.
If it is not found or a THROW node was matched, continue
to scan rules.
Applications
------------
1. Just ignore classes. All the routes are put into MAIN class
(and/or into DEFAULT class).
HOWTO: iproute add PREFIX [ tos TOS ] [ gw GW ] [ dev DEV ]
[ metric METRIC ] [ reject ] ... (look at iproute utility)
or use route utility from current net-tools.
2. Opposite case. Just forget all that you know about routing
tables. Every rule is supplied with its own gateway, device
info. record. This approach is not appropriate for automated
route maintenance, but it is ideal for manual configuration.
HOWTO: iproute addrule [ from PREFIX ] [ to PREFIX ] [ tos TOS ]
[ dev INPUTDEV] [ pref PREFERENCE ] route [ gw GATEWAY ]
[ dev OUTDEV ] .....
Warning: As of now the size of the routing table in this
approach is limited to 256. If someone likes this model, I'll
relax this limitation.
3. OSPF classes (see RFC1583, RFC1812 E.3.3)
Very clean, stable and robust algorithm for OSPF routing
domains. Unfortunately, it is not widely used in the Internet.
Proposed setup:
255 local addresses
254 interface routes
253 ASE routes with external metric
252 ASE routes with internal metric
251 inter-area routes
250 intra-area routes for 1st area
249 intra-area routes for 2nd area
etc.
Rules:
iproute addrule class 253
iproute addrule class 252
iproute addrule class 251
iproute addrule to a-prefix-for-1st-area class 250
iproute addrule to another-prefix-for-1st-area class 250
...
iproute addrule to a-prefix-for-2nd-area class 249
...
Area classes must be terminated with reject record.
iproute add default reject class 250
iproute add default reject class 249
...
4. The Variant Router Requirements Algorithm (RFC1812 E.3.2)
Create 16 classes for different TOS values.
It is a funny, but pretty useless algorithm.
I listed it just to show the power of new routing code.
5. All the variety of combinations......
GATED
-----
Gated does not understand classes, but it will work
happily in MAIN+DEFAULT. All policy routes can be set
and maintained manually.
IMPORTANT NOTE
--------------
route.c has a compilation time switch CONFIG_IP_LOCAL_RT_POLICY.
If it is set, locally originated packets are routed
using all the policy list. This is not very convenient and
pretty ambiguous when used with NAT and masquerading.
I set it to FALSE by default.
Alexey Kuznetov
kuznet@ms2.inr.ac.ru

View file

@ -0,0 +1,432 @@
PPP Generic Driver and Channel Interface
----------------------------------------
Paul Mackerras
paulus@samba.org
7 Feb 2002
The generic PPP driver in linux-2.4 provides an implementation of the
functionality which is of use in any PPP implementation, including:
* the network interface unit (ppp0 etc.)
* the interface to the networking code
* PPP multilink: splitting datagrams between multiple links, and
ordering and combining received fragments
* the interface to pppd, via a /dev/ppp character device
* packet compression and decompression
* TCP/IP header compression and decompression
* detecting network traffic for demand dialling and for idle timeouts
* simple packet filtering
For sending and receiving PPP frames, the generic PPP driver calls on
the services of PPP `channels'. A PPP channel encapsulates a
mechanism for transporting PPP frames from one machine to another. A
PPP channel implementation can be arbitrarily complex internally but
has a very simple interface with the generic PPP code: it merely has
to be able to send PPP frames, receive PPP frames, and optionally
handle ioctl requests. Currently there are PPP channel
implementations for asynchronous serial ports, synchronous serial
ports, and for PPP over ethernet.
This architecture makes it possible to implement PPP multilink in a
natural and straightforward way, by allowing more than one channel to
be linked to each ppp network interface unit. The generic layer is
responsible for splitting datagrams on transmit and recombining them
on receive.
PPP channel API
---------------
See include/linux/ppp_channel.h for the declaration of the types and
functions used to communicate between the generic PPP layer and PPP
channels.
Each channel has to provide two functions to the generic PPP layer,
via the ppp_channel.ops pointer:
* start_xmit() is called by the generic layer when it has a frame to
send. The channel has the option of rejecting the frame for
flow-control reasons. In this case, start_xmit() should return 0
and the channel should call the ppp_output_wakeup() function at a
later time when it can accept frames again, and the generic layer
will then attempt to retransmit the rejected frame(s). If the frame
is accepted, the start_xmit() function should return 1.
* ioctl() provides an interface which can be used by a user-space
program to control aspects of the channel's behaviour. This
procedure will be called when a user-space program does an ioctl
system call on an instance of /dev/ppp which is bound to the
channel. (Usually it would only be pppd which would do this.)
The generic PPP layer provides seven functions to channels:
* ppp_register_channel() is called when a channel has been created, to
notify the PPP generic layer of its presence. For example, setting
a serial port to the PPPDISC line discipline causes the ppp_async
channel code to call this function.
* ppp_unregister_channel() is called when a channel is to be
destroyed. For example, the ppp_async channel code calls this when
a hangup is detected on the serial port.
* ppp_output_wakeup() is called by a channel when it has previously
rejected a call to its start_xmit function, and can now accept more
packets.
* ppp_input() is called by a channel when it has received a complete
PPP frame.
* ppp_input_error() is called by a channel when it has detected that a
frame has been lost or dropped (for example, because of a FCS (frame
check sequence) error).
* ppp_channel_index() returns the channel index assigned by the PPP
generic layer to this channel. The channel should provide some way
(e.g. an ioctl) to transmit this back to user-space, as user-space
will need it to attach an instance of /dev/ppp to this channel.
* ppp_unit_number() returns the unit number of the ppp network
interface to which this channel is connected, or -1 if the channel
is not connected.
Connecting a channel to the ppp generic layer is initiated from the
channel code, rather than from the generic layer. The channel is
expected to have some way for a user-level process to control it
independently of the ppp generic layer. For example, with the
ppp_async channel, this is provided by the file descriptor to the
serial port.
Generally a user-level process will initialize the underlying
communications medium and prepare it to do PPP. For example, with an
async tty, this can involve setting the tty speed and modes, issuing
modem commands, and then going through some sort of dialog with the
remote system to invoke PPP service there. We refer to this process
as `discovery'. Then the user-level process tells the medium to
become a PPP channel and register itself with the generic PPP layer.
The channel then has to report the channel number assigned to it back
to the user-level process. From that point, the PPP negotiation code
in the PPP daemon (pppd) can take over and perform the PPP
negotiation, accessing the channel through the /dev/ppp interface.
At the interface to the PPP generic layer, PPP frames are stored in
skbuff structures and start with the two-byte PPP protocol number.
The frame does *not* include the 0xff `address' byte or the 0x03
`control' byte that are optionally used in async PPP. Nor is there
any escaping of control characters, nor are there any FCS or framing
characters included. That is all the responsibility of the channel
code, if it is needed for the particular medium. That is, the skbuffs
presented to the start_xmit() function contain only the 2-byte
protocol number and the data, and the skbuffs presented to ppp_input()
must be in the same format.
The channel must provide an instance of a ppp_channel struct to
represent the channel. The channel is free to use the `private' field
however it wishes. The channel should initialize the `mtu' and
`hdrlen' fields before calling ppp_register_channel() and not change
them until after ppp_unregister_channel() returns. The `mtu' field
represents the maximum size of the data part of the PPP frames, that
is, it does not include the 2-byte protocol number.
If the channel needs some headroom in the skbuffs presented to it for
transmission (i.e., some space free in the skbuff data area before the
start of the PPP frame), it should set the `hdrlen' field of the
ppp_channel struct to the amount of headroom required. The generic
PPP layer will attempt to provide that much headroom but the channel
should still check if there is sufficient headroom and copy the skbuff
if there isn't.
On the input side, channels should ideally provide at least 2 bytes of
headroom in the skbuffs presented to ppp_input(). The generic PPP
code does not require this but will be more efficient if this is done.
Buffering and flow control
--------------------------
The generic PPP layer has been designed to minimize the amount of data
that it buffers in the transmit direction. It maintains a queue of
transmit packets for the PPP unit (network interface device) plus a
queue of transmit packets for each attached channel. Normally the
transmit queue for the unit will contain at most one packet; the
exceptions are when pppd sends packets by writing to /dev/ppp, and
when the core networking code calls the generic layer's start_xmit()
function with the queue stopped, i.e. when the generic layer has
called netif_stop_queue(), which only happens on a transmit timeout.
The start_xmit function always accepts and queues the packet which it
is asked to transmit.
Transmit packets are dequeued from the PPP unit transmit queue and
then subjected to TCP/IP header compression and packet compression
(Deflate or BSD-Compress compression), as appropriate. After this
point the packets can no longer be reordered, as the decompression
algorithms rely on receiving compressed packets in the same order that
they were generated.
If multilink is not in use, this packet is then passed to the attached
channel's start_xmit() function. If the channel refuses to take
the packet, the generic layer saves it for later transmission. The
generic layer will call the channel's start_xmit() function again
when the channel calls ppp_output_wakeup() or when the core
networking code calls the generic layer's start_xmit() function
again. The generic layer contains no timeout and retransmission
logic; it relies on the core networking code for that.
If multilink is in use, the generic layer divides the packet into one
or more fragments and puts a multilink header on each fragment. It
decides how many fragments to use based on the length of the packet
and the number of channels which are potentially able to accept a
fragment at the moment. A channel is potentially able to accept a
fragment if it doesn't have any fragments currently queued up for it
to transmit. The channel may still refuse a fragment; in this case
the fragment is queued up for the channel to transmit later. This
scheme has the effect that more fragments are given to higher-
bandwidth channels. It also means that under light load, the generic
layer will tend to fragment large packets across all the channels,
thus reducing latency, while under heavy load, packets will tend to be
transmitted as single fragments, thus reducing the overhead of
fragmentation.
SMP safety
----------
The PPP generic layer has been designed to be SMP-safe. Locks are
used around accesses to the internal data structures where necessary
to ensure their integrity. As part of this, the generic layer
requires that the channels adhere to certain requirements and in turn
provides certain guarantees to the channels. Essentially the channels
are required to provide the appropriate locking on the ppp_channel
structures that form the basis of the communication between the
channel and the generic layer. This is because the channel provides
the storage for the ppp_channel structure, and so the channel is
required to provide the guarantee that this storage exists and is
valid at the appropriate times.
The generic layer requires these guarantees from the channel:
* The ppp_channel object must exist from the time that
ppp_register_channel() is called until after the call to
ppp_unregister_channel() returns.
* No thread may be in a call to any of ppp_input(), ppp_input_error(),
ppp_output_wakeup(), ppp_channel_index() or ppp_unit_number() for a
channel at the time that ppp_unregister_channel() is called for that
channel.
* ppp_register_channel() and ppp_unregister_channel() must be called
from process context, not interrupt or softirq/BH context.
* The remaining generic layer functions may be called at softirq/BH
level but must not be called from a hardware interrupt handler.
* The generic layer may call the channel start_xmit() function at
softirq/BH level but will not call it at interrupt level. Thus the
start_xmit() function may not block.
* The generic layer will only call the channel ioctl() function in
process context.
The generic layer provides these guarantees to the channels:
* The generic layer will not call the start_xmit() function for a
channel while any thread is already executing in that function for
that channel.
* The generic layer will not call the ioctl() function for a channel
while any thread is already executing in that function for that
channel.
* By the time a call to ppp_unregister_channel() returns, no thread
will be executing in a call from the generic layer to that channel's
start_xmit() or ioctl() function, and the generic layer will not
call either of those functions subsequently.
Interface to pppd
-----------------
The PPP generic layer exports a character device interface called
/dev/ppp. This is used by pppd to control PPP interface units and
channels. Although there is only one /dev/ppp, each open instance of
/dev/ppp acts independently and can be attached either to a PPP unit
or a PPP channel. This is achieved using the file->private_data field
to point to a separate object for each open instance of /dev/ppp. In
this way an effect similar to Solaris' clone open is obtained,
allowing us to control an arbitrary number of PPP interfaces and
channels without having to fill up /dev with hundreds of device names.
When /dev/ppp is opened, a new instance is created which is initially
unattached. Using an ioctl call, it can then be attached to an
existing unit, attached to a newly-created unit, or attached to an
existing channel. An instance attached to a unit can be used to send
and receive PPP control frames, using the read() and write() system
calls, along with poll() if necessary. Similarly, an instance
attached to a channel can be used to send and receive PPP frames on
that channel.
In multilink terms, the unit represents the bundle, while the channels
represent the individual physical links. Thus, a PPP frame sent by a
write to the unit (i.e., to an instance of /dev/ppp attached to the
unit) will be subject to bundle-level compression and to fragmentation
across the individual links (if multilink is in use). In contrast, a
PPP frame sent by a write to the channel will be sent as-is on that
channel, without any multilink header.
A channel is not initially attached to any unit. In this state it can
be used for PPP negotiation but not for the transfer of data packets.
It can then be connected to a PPP unit with an ioctl call, which
makes it available to send and receive data packets for that unit.
The ioctl calls which are available on an instance of /dev/ppp depend
on whether it is unattached, attached to a PPP interface, or attached
to a PPP channel. The ioctl calls which are available on an
unattached instance are:
* PPPIOCNEWUNIT creates a new PPP interface and makes this /dev/ppp
instance the "owner" of the interface. The argument should point to
an int which is the desired unit number if >= 0, or -1 to assign the
lowest unused unit number. Being the owner of the interface means
that the interface will be shut down if this instance of /dev/ppp is
closed.
* PPPIOCATTACH attaches this instance to an existing PPP interface.
The argument should point to an int containing the unit number.
This does not make this instance the owner of the PPP interface.
* PPPIOCATTCHAN attaches this instance to an existing PPP channel.
The argument should point to an int containing the channel number.
The ioctl calls available on an instance of /dev/ppp attached to a
channel are:
* PPPIOCDETACH detaches the instance from the channel. This ioctl is
deprecated since the same effect can be achieved by closing the
instance. In order to prevent possible races this ioctl will fail
with an EINVAL error if more than one file descriptor refers to this
instance (i.e. as a result of dup(), dup2() or fork()).
* PPPIOCCONNECT connects this channel to a PPP interface. The
argument should point to an int containing the interface unit
number. It will return an EINVAL error if the channel is already
connected to an interface, or ENXIO if the requested interface does
not exist.
* PPPIOCDISCONN disconnects this channel from the PPP interface that
it is connected to. It will return an EINVAL error if the channel
is not connected to an interface.
* All other ioctl commands are passed to the channel ioctl() function.
The ioctl calls that are available on an instance that is attached to
an interface unit are:
* PPPIOCSMRU sets the MRU (maximum receive unit) for the interface.
The argument should point to an int containing the new MRU value.
* PPPIOCSFLAGS sets flags which control the operation of the
interface. The argument should be a pointer to an int containing
the new flags value. The bits in the flags value that can be set
are:
SC_COMP_TCP enable transmit TCP header compression
SC_NO_TCP_CCID disable connection-id compression for
TCP header compression
SC_REJ_COMP_TCP disable receive TCP header decompression
SC_CCP_OPEN Compression Control Protocol (CCP) is
open, so inspect CCP packets
SC_CCP_UP CCP is up, may (de)compress packets
SC_LOOP_TRAFFIC send IP traffic to pppd
SC_MULTILINK enable PPP multilink fragmentation on
transmitted packets
SC_MP_SHORTSEQ expect short multilink sequence
numbers on received multilink fragments
SC_MP_XSHORTSEQ transmit short multilink sequence nos.
The values of these flags are defined in <linux/ppp-ioctl.h>. Note
that the values of the SC_MULTILINK, SC_MP_SHORTSEQ and
SC_MP_XSHORTSEQ bits are ignored if the CONFIG_PPP_MULTILINK option
is not selected.
* PPPIOCGFLAGS returns the value of the status/control flags for the
interface unit. The argument should point to an int where the ioctl
will store the flags value. As well as the values listed above for
PPPIOCSFLAGS, the following bits may be set in the returned value:
SC_COMP_RUN CCP compressor is running
SC_DECOMP_RUN CCP decompressor is running
SC_DC_ERROR CCP decompressor detected non-fatal error
SC_DC_FERROR CCP decompressor detected fatal error
* PPPIOCSCOMPRESS sets the parameters for packet compression or
decompression. The argument should point to a ppp_option_data
structure (defined in <linux/ppp-ioctl.h>), which contains a
pointer/length pair which should describe a block of memory
containing a CCP option specifying a compression method and its
parameters. The ppp_option_data struct also contains a `transmit'
field. If this is 0, the ioctl will affect the receive path,
otherwise the transmit path.
* PPPIOCGUNIT returns, in the int pointed to by the argument, the unit
number of this interface unit.
* PPPIOCSDEBUG sets the debug flags for the interface to the value in
the int pointed to by the argument. Only the least significant bit
is used; if this is 1 the generic layer will print some debug
messages during its operation. This is only intended for debugging
the generic PPP layer code; it is generally not helpful for working
out why a PPP connection is failing.
* PPPIOCGDEBUG returns the debug flags for the interface in the int
pointed to by the argument.
* PPPIOCGIDLE returns the time, in seconds, since the last data
packets were sent and received. The argument should point to a
ppp_idle structure (defined in <linux/ppp_defs.h>). If the
CONFIG_PPP_FILTER option is enabled, the set of packets which reset
the transmit and receive idle timers is restricted to those which
pass the `active' packet filter.
* PPPIOCSMAXCID sets the maximum connection-ID parameter (and thus the
number of connection slots) for the TCP header compressor and
decompressor. The lower 16 bits of the int pointed to by the
argument specify the maximum connection-ID for the compressor. If
the upper 16 bits of that int are non-zero, they specify the maximum
connection-ID for the decompressor, otherwise the decompressor's
maximum connection-ID is set to 15.
* PPPIOCSNPMODE sets the network-protocol mode for a given network
protocol. The argument should point to an npioctl struct (defined
in <linux/ppp-ioctl.h>). The `protocol' field gives the PPP protocol
number for the protocol to be affected, and the `mode' field
specifies what to do with packets for that protocol:
NPMODE_PASS normal operation, transmit and receive packets
NPMODE_DROP silently drop packets for this protocol
NPMODE_ERROR drop packets and return an error on transmit
NPMODE_QUEUE queue up packets for transmit, drop received
packets
At present NPMODE_ERROR and NPMODE_QUEUE have the same effect as
NPMODE_DROP.
* PPPIOCGNPMODE returns the network-protocol mode for a given
protocol. The argument should point to an npioctl struct with the
`protocol' field set to the PPP protocol number for the protocol of
interest. On return the `mode' field will be set to the network-
protocol mode for that protocol.
* PPPIOCSPASS and PPPIOCSACTIVE set the `pass' and `active' packet
filters. These ioctls are only available if the CONFIG_PPP_FILTER
option is selected. The argument should point to a sock_fprog
structure (defined in <linux/filter.h>) containing the compiled BPF
instructions for the filter. Packets are dropped if they fail the
`pass' filter; otherwise, if they fail the `active' filter they are
passed but they do not reset the transmit or receive idle timer.
* PPPIOCSMRRU enables or disables multilink processing for received
packets and sets the multilink MRRU (maximum reconstructed receive
unit). The argument should point to an int containing the new MRRU
value. If the MRRU value is 0, processing of received multilink
fragments is disabled. This ioctl is only available if the
CONFIG_PPP_MULTILINK option is selected.
Last modified: 7-feb-2002

View file

@ -0,0 +1,48 @@
This document describes the interfaces /proc/net/tcp and /proc/net/tcp6.
Note that these interfaces are deprecated in favor of tcp_diag.
These /proc interfaces provide information about currently active TCP
connections, and are implemented by tcp4_seq_show() in net/ipv4/tcp_ipv4.c
and tcp6_seq_show() in net/ipv6/tcp_ipv6.c, respectively.
It will first list all listening TCP sockets, and next list all established
TCP connections. A typical entry of /proc/net/tcp would look like this (split
up into 3 parts because of the length of the line):
46: 010310AC:9C4C 030310AC:1770 01
| | | | | |--> connection state
| | | | |------> remote TCP port number
| | | |-------------> remote IPv4 address
| | |--------------------> local TCP port number
| |---------------------------> local IPv4 address
|----------------------------------> number of entry
00000150:00000000 01:00000019 00000000
| | | | |--> number of unrecovered RTO timeouts
| | | |----------> number of jiffies until timer expires
| | |----------------> timer_active (see below)
| |----------------------> receive-queue
|-------------------------------> transmit-queue
1000 0 54165785 4 cd1e6040 25 4 27 3 -1
| | | | | | | | | |--> slow start size threshold,
| | | | | | | | | or -1 if the threshold
| | | | | | | | | is >= 0xFFFF
| | | | | | | | |----> sending congestion window
| | | | | | | |-------> (ack.quick<<1)|ack.pingpong
| | | | | | |---------> Predicted tick of soft clock
| | | | | | (delayed ACK control data)
| | | | | |------------> retransmit timeout
| | | | |------------------> location of socket in memory
| | | |-----------------------> socket reference count
| | |-----------------------------> inode
| |----------------------------------> unanswered 0-window probes
|---------------------------------------------> uid
timer_active:
0 no timer is pending
1 retransmit-timer is pending
2 another timer (e.g. delayed ack or keepalive) is pending
3 this is a socket in TIME_WAIT state. Not all fields will contain
data (or even exist)
4 zero window probe timer is pending

View file

@ -0,0 +1,152 @@
How to use radiotap headers
===========================
Pointer to the radiotap include file
------------------------------------
Radiotap headers are variable-length and extensible, you can get most of the
information you need to know on them from:
./include/net/ieee80211_radiotap.h
This document gives an overview and warns on some corner cases.
Structure of the header
-----------------------
There is a fixed portion at the start which contains a u32 bitmap that defines
if the possible argument associated with that bit is present or not. So if b0
of the it_present member of ieee80211_radiotap_header is set, it means that
the header for argument index 0 (IEEE80211_RADIOTAP_TSFT) is present in the
argument area.
< 8-byte ieee80211_radiotap_header >
[ <possible argument bitmap extensions ... > ]
[ <argument> ... ]
At the moment there are only 13 possible argument indexes defined, but in case
we run out of space in the u32 it_present member, it is defined that b31 set
indicates that there is another u32 bitmap following (shown as "possible
argument bitmap extensions..." above), and the start of the arguments is moved
forward 4 bytes each time.
Note also that the it_len member __le16 is set to the total number of bytes
covered by the ieee80211_radiotap_header and any arguments following.
Requirements for arguments
--------------------------
After the fixed part of the header, the arguments follow for each argument
index whose matching bit is set in the it_present member of
ieee80211_radiotap_header.
- the arguments are all stored little-endian!
- the argument payload for a given argument index has a fixed size. So
IEEE80211_RADIOTAP_TSFT being present always indicates an 8-byte argument is
present. See the comments in ./include/net/ieee80211_radiotap.h for a nice
breakdown of all the argument sizes
- the arguments must be aligned to a boundary of the argument size using
padding. So a u16 argument must start on the next u16 boundary if it isn't
already on one, a u32 must start on the next u32 boundary and so on.
- "alignment" is relative to the start of the ieee80211_radiotap_header, ie,
the first byte of the radiotap header. The absolute alignment of that first
byte isn't defined. So even if the whole radiotap header is starting at, eg,
address 0x00000003, still the first byte of the radiotap header is treated as
0 for alignment purposes.
- the above point that there may be no absolute alignment for multibyte
entities in the fixed radiotap header or the argument region means that you
have to take special evasive action when trying to access these multibyte
entities. Some arches like Blackfin cannot deal with an attempt to
dereference, eg, a u16 pointer that is pointing to an odd address. Instead
you have to use a kernel API get_unaligned() to dereference the pointer,
which will do it bytewise on the arches that require that.
- The arguments for a given argument index can be a compound of multiple types
together. For example IEEE80211_RADIOTAP_CHANNEL has an argument payload
consisting of two u16s of total length 4. When this happens, the padding
rule is applied dealing with a u16, NOT dealing with a 4-byte single entity.
Example valid radiotap header
-----------------------------
0x00, 0x00, // <-- radiotap version + pad byte
0x0b, 0x00, // <- radiotap header length
0x04, 0x0c, 0x00, 0x00, // <-- bitmap
0x6c, // <-- rate (in 500kHz units)
0x0c, //<-- tx power
0x01 //<-- antenna
Using the Radiotap Parser
-------------------------
If you are having to parse a radiotap struct, you can radically simplify the
job by using the radiotap parser that lives in net/wireless/radiotap.c and has
its prototypes available in include/net/cfg80211.h. You use it like this:
#include <net/cfg80211.h>
/* buf points to the start of the radiotap header part */
int MyFunction(u8 * buf, int buflen)
{
int pkt_rate_100kHz = 0, antenna = 0, pwr = 0;
struct ieee80211_radiotap_iterator iterator;
int ret = ieee80211_radiotap_iterator_init(&iterator, buf, buflen);
while (!ret) {
ret = ieee80211_radiotap_iterator_next(&iterator);
if (ret)
continue;
/* see if this argument is something we can use */
switch (iterator.this_arg_index) {
/*
* You must take care when dereferencing iterator.this_arg
* for multibyte types... the pointer is not aligned. Use
* get_unaligned((type *)iterator.this_arg) to dereference
* iterator.this_arg for type "type" safely on all arches.
*/
case IEEE80211_RADIOTAP_RATE:
/* radiotap "rate" u8 is in
* 500kbps units, eg, 0x02=1Mbps
*/
pkt_rate_100kHz = (*iterator.this_arg) * 5;
break;
case IEEE80211_RADIOTAP_ANTENNA:
/* radiotap uses 0 for 1st ant */
antenna = *iterator.this_arg);
break;
case IEEE80211_RADIOTAP_DBM_TX_POWER:
pwr = *iterator.this_arg;
break;
default:
break;
}
} /* while more rt headers */
if (ret != -ENOENT)
return TXRX_DROP;
/* discard the radiotap header part */
buf += iterator.max_length;
buflen -= iterator.max_length;
...
}
Andy Green <andy@warmcat.com>

View file

@ -0,0 +1,150 @@
September 21, 1999
Copyright (c) 1998 Corey Thomas (corey@world.std.com)
This file is the documentation for the Raylink Wireless LAN card driver for
Linux. The Raylink wireless LAN card is a PCMCIA card which provides IEEE
802.11 compatible wireless network connectivity at 1 and 2 megabits/second.
See http://www.raytheon.com/micro/raylink/ for more information on the Raylink
card. This driver is in early development and does have bugs. See the known
bugs and limitations at the end of this document for more information.
This driver also works with WebGear's Aviator 2.4 and Aviator Pro
wireless LAN cards.
As of kernel 2.3.18, the ray_cs driver is part of the Linux kernel
source. My web page for the development of ray_cs is at
http://web.ralinktech.com/ralink/Home/Support/Linux.html
and I can be emailed at corey@world.std.com
The kernel driver is based on ray_cs-1.62.tgz
The driver at my web page is intended to be used as an add on to
David Hinds pcmcia package. All the command line parameters are
available when compiled as a module. When built into the kernel, only
the essid= string parameter is available via the kernel command line.
This will change after the method of sorting out parameters for all
the PCMCIA drivers is agreed upon. If you must have a built in driver
with nondefault parameters, they can be edited in
/usr/src/linux/drivers/net/pcmcia/ray_cs.c. Searching for module_param
will find them all.
Information on card services is available at:
http://pcmcia-cs.sourceforge.net/
Card services user programs are still required for PCMCIA devices.
pcmcia-cs-3.1.1 or greater is required for the kernel version of
the driver.
Currently, ray_cs is not part of David Hinds card services package,
so the following magic is required.
At the end of the /etc/pcmcia/config.opts file, add the line:
source ./ray_cs.opts
This will make card services read the ray_cs.opts file
when starting. Create the file /etc/pcmcia/ray_cs.opts containing the
following:
#### start of /etc/pcmcia/ray_cs.opts ###################
# Configuration options for Raylink Wireless LAN PCMCIA card
device "ray_cs"
class "network" module "misc/ray_cs"
card "RayLink PC Card WLAN Adapter"
manfid 0x01a6, 0x0000
bind "ray_cs"
module "misc/ray_cs" opts ""
#### end of /etc/pcmcia/ray_cs.opts #####################
To join an existing network with
different parameters, contact the network administrator for the
configuration information, and edit /etc/pcmcia/ray_cs.opts.
Add the parameters below between the empty quotes.
Parameters for ray_cs driver which may be specified in ray_cs.opts:
bc integer 0 = normal mode (802.11 timing)
1 = slow down inter frame timing to allow
operation with older breezecom access
points.
beacon_period integer beacon period in Kilo-microseconds
legal values = must be integer multiple
of hop dwell
default = 256
country integer 1 = USA (default)
2 = Europe
3 = Japan
4 = Korea
5 = Spain
6 = France
7 = Israel
8 = Australia
essid string ESS ID - network name to join
string with maximum length of 32 chars
default value = "ADHOC_ESSID"
hop_dwell integer hop dwell time in Kilo-microseconds
legal values = 16,32,64,128(default),256
irq_mask integer linux standard 16 bit value 1bit/IRQ
lsb is IRQ 0, bit 1 is IRQ 1 etc.
Used to restrict choice of IRQ's to use.
Recommended method for controlling
interrupts is in /etc/pcmcia/config.opts
net_type integer 0 (default) = adhoc network,
1 = infrastructure
phy_addr string string containing new MAC address in
hex, must start with x eg
x00008f123456
psm integer 0 = continuously active
1 = power save mode (not useful yet)
pc_debug integer (0-5) larger values for more verbose
logging. Replaces ray_debug.
ray_debug integer Replaced with pc_debug
ray_mem_speed integer defaults to 500
sniffer integer 0 = not sniffer (default)
1 = sniffer which can be used to record all
network traffic using tcpdump or similar,
but no normal network use is allowed.
translate integer 0 = no translation (encapsulate frames)
1 = translation (RFC1042/802.1)
More on sniffer mode:
tcpdump does not understand 802.11 headers, so it can't
interpret the contents, but it can record to a file. This is only
useful for debugging 802.11 lowlevel protocols that are not visible to
linux. If you want to watch ftp xfers, or do similar things, you
don't need to use sniffer mode. Also, some packet types are never
sent up by the card, so you will never see them (ack, rts, cts, probe
etc.) There is a simple program (showcap) included in the ray_cs
package which parses the 802.11 headers.
Known Problems and missing features
Does not work with non x86
Does not work with SMP
Support for defragmenting frames is not yet debugged, and in
fact is known to not work. I have never encountered a net set
up to fragment, but still, it should be fixed.
The ioctl support is incomplete. The hardware address cannot be set
using ifconfig yet. If a different hardware address is needed, it may
be set using the phy_addr parameter in ray_cs.opts. This requires
a card insertion to take effect.

View file

@ -0,0 +1,356 @@
Overview
========
This readme tries to provide some background on the hows and whys of RDS,
and will hopefully help you find your way around the code.
In addition, please see this email about RDS origins:
http://oss.oracle.com/pipermail/rds-devel/2007-November/000228.html
RDS Architecture
================
RDS provides reliable, ordered datagram delivery by using a single
reliable connection between any two nodes in the cluster. This allows
applications to use a single socket to talk to any other process in the
cluster - so in a cluster with N processes you need N sockets, in contrast
to N*N if you use a connection-oriented socket transport like TCP.
RDS is not Infiniband-specific; it was designed to support different
transports. The current implementation used to support RDS over TCP as well
as IB. Work is in progress to support RDS over iWARP, and using DCE to
guarantee no dropped packets on Ethernet, it may be possible to use RDS over
UDP in the future.
The high-level semantics of RDS from the application's point of view are
* Addressing
RDS uses IPv4 addresses and 16bit port numbers to identify
the end point of a connection. All socket operations that involve
passing addresses between kernel and user space generally
use a struct sockaddr_in.
The fact that IPv4 addresses are used does not mean the underlying
transport has to be IP-based. In fact, RDS over IB uses a
reliable IB connection; the IP address is used exclusively to
locate the remote node's GID (by ARPing for the given IP).
The port space is entirely independent of UDP, TCP or any other
protocol.
* Socket interface
RDS sockets work *mostly* as you would expect from a BSD
socket. The next section will cover the details. At any rate,
all I/O is performed through the standard BSD socket API.
Some additions like zerocopy support are implemented through
control messages, while other extensions use the getsockopt/
setsockopt calls.
Sockets must be bound before you can send or receive data.
This is needed because binding also selects a transport and
attaches it to the socket. Once bound, the transport assignment
does not change. RDS will tolerate IPs moving around (eg in
a active-active HA scenario), but only as long as the address
doesn't move to a different transport.
* sysctls
RDS supports a number of sysctls in /proc/sys/net/rds
Socket Interface
================
AF_RDS, PF_RDS, SOL_RDS
These constants haven't been assigned yet, because RDS isn't in
mainline yet. Currently, the kernel module assigns some constant
and publishes it to user space through two sysctl files
/proc/sys/net/rds/pf_rds
/proc/sys/net/rds/sol_rds
fd = socket(PF_RDS, SOCK_SEQPACKET, 0);
This creates a new, unbound RDS socket.
setsockopt(SOL_SOCKET): send and receive buffer size
RDS honors the send and receive buffer size socket options.
You are not allowed to queue more than SO_SNDSIZE bytes to
a socket. A message is queued when sendmsg is called, and
it leaves the queue when the remote system acknowledges
its arrival.
The SO_RCVSIZE option controls the maximum receive queue length.
This is a soft limit rather than a hard limit - RDS will
continue to accept and queue incoming messages, even if that
takes the queue length over the limit. However, it will also
mark the port as "congested" and send a congestion update to
the source node. The source node is supposed to throttle any
processes sending to this congested port.
bind(fd, &sockaddr_in, ...)
This binds the socket to a local IP address and port, and a
transport.
sendmsg(fd, ...)
Sends a message to the indicated recipient. The kernel will
transparently establish the underlying reliable connection
if it isn't up yet.
An attempt to send a message that exceeds SO_SNDSIZE will
return with -EMSGSIZE
An attempt to send a message that would take the total number
of queued bytes over the SO_SNDSIZE threshold will return
EAGAIN.
An attempt to send a message to a destination that is marked
as "congested" will return ENOBUFS.
recvmsg(fd, ...)
Receives a message that was queued to this socket. The sockets
recv queue accounting is adjusted, and if the queue length
drops below SO_SNDSIZE, the port is marked uncongested, and
a congestion update is sent to all peers.
Applications can ask the RDS kernel module to receive
notifications via control messages (for instance, there is a
notification when a congestion update arrived, or when a RDMA
operation completes). These notifications are received through
the msg.msg_control buffer of struct msghdr. The format of the
messages is described in manpages.
poll(fd)
RDS supports the poll interface to allow the application
to implement async I/O.
POLLIN handling is pretty straightforward. When there's an
incoming message queued to the socket, or a pending notification,
we signal POLLIN.
POLLOUT is a little harder. Since you can essentially send
to any destination, RDS will always signal POLLOUT as long as
there's room on the send queue (ie the number of bytes queued
is less than the sendbuf size).
However, the kernel will refuse to accept messages to
a destination marked congested - in this case you will loop
forever if you rely on poll to tell you what to do.
This isn't a trivial problem, but applications can deal with
this - by using congestion notifications, and by checking for
ENOBUFS errors returned by sendmsg.
setsockopt(SOL_RDS, RDS_CANCEL_SENT_TO, &sockaddr_in)
This allows the application to discard all messages queued to a
specific destination on this particular socket.
This allows the application to cancel outstanding messages if
it detects a timeout. For instance, if it tried to send a message,
and the remote host is unreachable, RDS will keep trying forever.
The application may decide it's not worth it, and cancel the
operation. In this case, it would use RDS_CANCEL_SENT_TO to
nuke any pending messages.
RDMA for RDS
============
see rds-rdma(7) manpage (available in rds-tools)
Congestion Notifications
========================
see rds(7) manpage
RDS Protocol
============
Message header
The message header is a 'struct rds_header' (see rds.h):
Fields:
h_sequence:
per-packet sequence number
h_ack:
piggybacked acknowledgment of last packet received
h_len:
length of data, not including header
h_sport:
source port
h_dport:
destination port
h_flags:
CONG_BITMAP - this is a congestion update bitmap
ACK_REQUIRED - receiver must ack this packet
RETRANSMITTED - packet has previously been sent
h_credit:
indicate to other end of connection that
it has more credits available (i.e. there is
more send room)
h_padding[4]:
unused, for future use
h_csum:
header checksum
h_exthdr:
optional data can be passed here. This is currently used for
passing RDMA-related information.
ACK and retransmit handling
One might think that with reliable IB connections you wouldn't need
to ack messages that have been received. The problem is that IB
hardware generates an ack message before it has DMAed the message
into memory. This creates a potential message loss if the HCA is
disabled for any reason between when it sends the ack and before
the message is DMAed and processed. This is only a potential issue
if another HCA is available for fail-over.
Sending an ack immediately would allow the sender to free the sent
message from their send queue quickly, but could cause excessive
traffic to be used for acks. RDS piggybacks acks on sent data
packets. Ack-only packets are reduced by only allowing one to be
in flight at a time, and by the sender only asking for acks when
its send buffers start to fill up. All retransmissions are also
acked.
Flow Control
RDS's IB transport uses a credit-based mechanism to verify that
there is space in the peer's receive buffers for more data. This
eliminates the need for hardware retries on the connection.
Congestion
Messages waiting in the receive queue on the receiving socket
are accounted against the sockets SO_RCVBUF option value. Only
the payload bytes in the message are accounted for. If the
number of bytes queued equals or exceeds rcvbuf then the socket
is congested. All sends attempted to this socket's address
should return block or return -EWOULDBLOCK.
Applications are expected to be reasonably tuned such that this
situation very rarely occurs. An application encountering this
"back-pressure" is considered a bug.
This is implemented by having each node maintain bitmaps which
indicate which ports on bound addresses are congested. As the
bitmap changes it is sent through all the connections which
terminate in the local address of the bitmap which changed.
The bitmaps are allocated as connections are brought up. This
avoids allocation in the interrupt handling path which queues
sages on sockets. The dense bitmaps let transports send the
entire bitmap on any bitmap change reasonably efficiently. This
is much easier to implement than some finer-grained
communication of per-port congestion. The sender does a very
inexpensive bit test to test if the port it's about to send to
is congested or not.
RDS Transport Layer
==================
As mentioned above, RDS is not IB-specific. Its code is divided
into a general RDS layer and a transport layer.
The general layer handles the socket API, congestion handling,
loopback, stats, usermem pinning, and the connection state machine.
The transport layer handles the details of the transport. The IB
transport, for example, handles all the queue pairs, work requests,
CM event handlers, and other Infiniband details.
RDS Kernel Structures
=====================
struct rds_message
aka possibly "rds_outgoing", the generic RDS layer copies data to
be sent and sets header fields as needed, based on the socket API.
This is then queued for the individual connection and sent by the
connection's transport.
struct rds_incoming
a generic struct referring to incoming data that can be handed from
the transport to the general code and queued by the general code
while the socket is awoken. It is then passed back to the transport
code to handle the actual copy-to-user.
struct rds_socket
per-socket information
struct rds_connection
per-connection information
struct rds_transport
pointers to transport-specific functions
struct rds_statistics
non-transport-specific statistics
struct rds_cong_map
wraps the raw congestion bitmap, contains rbnode, waitq, etc.
Connection management
=====================
Connections may be in UP, DOWN, CONNECTING, DISCONNECTING, and
ERROR states.
The first time an attempt is made by an RDS socket to send data to
a node, a connection is allocated and connected. That connection is
then maintained forever -- if there are transport errors, the
connection will be dropped and re-established.
Dropping a connection while packets are queued will cause queued or
partially-sent datagrams to be retransmitted when the connection is
re-established.
The send path
=============
rds_sendmsg()
struct rds_message built from incoming data
CMSGs parsed (e.g. RDMA ops)
transport connection alloced and connected if not already
rds_message placed on send queue
send worker awoken
rds_send_worker()
calls rds_send_xmit() until queue is empty
rds_send_xmit()
transmits congestion map if one is pending
may set ACK_REQUIRED
calls transport to send either non-RDMA or RDMA message
(RDMA ops never retransmitted)
rds_ib_xmit()
allocs work requests from send ring
adds any new send credits available to peer (h_credits)
maps the rds_message's sg list
piggybacks ack
populates work requests
post send to connection's queue pair
The recv path
=============
rds_ib_recv_cq_comp_handler()
looks at write completions
unmaps recv buffer from device
no errors, call rds_ib_process_recv()
refill recv ring
rds_ib_process_recv()
validate header checksum
copy header to rds_ib_incoming struct if start of a new datagram
add to ibinc's fraglist
if competed datagram:
update cong map if datagram was cong update
call rds_recv_incoming() otherwise
note if ack is required
rds_recv_incoming()
drop duplicate packets
respond to pings
find the sock associated with this datagram
add to sock queue
wake up sock
do some congestion calculations
rds_recvmsg
copy data into user iovec
handle CMSGs
return to application

View file

@ -0,0 +1,214 @@
Linux wireless regulatory documentation
---------------------------------------
This document gives a brief review over how the Linux wireless
regulatory infrastructure works.
More up to date information can be obtained at the project's web page:
http://wireless.kernel.org/en/developers/Regulatory
Keeping regulatory domains in userspace
---------------------------------------
Due to the dynamic nature of regulatory domains we keep them
in userspace and provide a framework for userspace to upload
to the kernel one regulatory domain to be used as the central
core regulatory domain all wireless devices should adhere to.
How to get regulatory domains to the kernel
-------------------------------------------
Userspace gets a regulatory domain in the kernel by having
a userspace agent build it and send it via nl80211. Only
expected regulatory domains will be respected by the kernel.
A currently available userspace agent which can accomplish this
is CRDA - central regulatory domain agent. Its documented here:
http://wireless.kernel.org/en/developers/Regulatory/CRDA
Essentially the kernel will send a udev event when it knows
it needs a new regulatory domain. A udev rule can be put in place
to trigger crda to send the respective regulatory domain for a
specific ISO/IEC 3166 alpha2.
Below is an example udev rule which can be used:
# Example file, should be put in /etc/udev/rules.d/regulatory.rules
KERNEL=="regulatory*", ACTION=="change", SUBSYSTEM=="platform", RUN+="/sbin/crda"
The alpha2 is passed as an environment variable under the variable COUNTRY.
Who asks for regulatory domains?
--------------------------------
* Users
Users can use iw:
http://wireless.kernel.org/en/users/Documentation/iw
An example:
# set regulatory domain to "Costa Rica"
iw reg set CR
This will request the kernel to set the regulatory domain to
the specificied alpha2. The kernel in turn will then ask userspace
to provide a regulatory domain for the alpha2 specified by the user
by sending a uevent.
* Wireless subsystems for Country Information elements
The kernel will send a uevent to inform userspace a new
regulatory domain is required. More on this to be added
as its integration is added.
* Drivers
If drivers determine they need a specific regulatory domain
set they can inform the wireless core using regulatory_hint().
They have two options -- they either provide an alpha2 so that
crda can provide back a regulatory domain for that country or
they can build their own regulatory domain based on internal
custom knowledge so the wireless core can respect it.
*Most* drivers will rely on the first mechanism of providing a
regulatory hint with an alpha2. For these drivers there is an additional
check that can be used to ensure compliance based on custom EEPROM
regulatory data. This additional check can be used by drivers by
registering on its struct wiphy a reg_notifier() callback. This notifier
is called when the core's regulatory domain has been changed. The driver
can use this to review the changes made and also review who made them
(driver, user, country IE) and determine what to allow based on its
internal EEPROM data. Devices drivers wishing to be capable of world
roaming should use this callback. More on world roaming will be
added to this document when its support is enabled.
Device drivers who provide their own built regulatory domain
do not need a callback as the channels registered by them are
the only ones that will be allowed and therefore *additional*
channels cannot be enabled.
Example code - drivers hinting an alpha2:
------------------------------------------
This example comes from the zd1211rw device driver. You can start
by having a mapping of your device's EEPROM country/regulatory
domain value to a specific alpha2 as follows:
static struct zd_reg_alpha2_map reg_alpha2_map[] = {
{ ZD_REGDOMAIN_FCC, "US" },
{ ZD_REGDOMAIN_IC, "CA" },
{ ZD_REGDOMAIN_ETSI, "DE" }, /* Generic ETSI, use most restrictive */
{ ZD_REGDOMAIN_JAPAN, "JP" },
{ ZD_REGDOMAIN_JAPAN_ADD, "JP" },
{ ZD_REGDOMAIN_SPAIN, "ES" },
{ ZD_REGDOMAIN_FRANCE, "FR" },
Then you can define a routine to map your read EEPROM value to an alpha2,
as follows:
static int zd_reg2alpha2(u8 regdomain, char *alpha2)
{
unsigned int i;
struct zd_reg_alpha2_map *reg_map;
for (i = 0; i < ARRAY_SIZE(reg_alpha2_map); i++) {
reg_map = &reg_alpha2_map[i];
if (regdomain == reg_map->reg) {
alpha2[0] = reg_map->alpha2[0];
alpha2[1] = reg_map->alpha2[1];
return 0;
}
}
return 1;
}
Lastly, you can then hint to the core of your discovered alpha2, if a match
was found. You need to do this after you have registered your wiphy. You
are expected to do this during initialization.
r = zd_reg2alpha2(mac->regdomain, alpha2);
if (!r)
regulatory_hint(hw->wiphy, alpha2);
Example code - drivers providing a built in regulatory domain:
--------------------------------------------------------------
[NOTE: This API is not currently available, it can be added when required]
If you have regulatory information you can obtain from your
driver and you *need* to use this we let you build a regulatory domain
structure and pass it to the wireless core. To do this you should
kmalloc() a structure big enough to hold your regulatory domain
structure and you should then fill it with your data. Finally you simply
call regulatory_hint() with the regulatory domain structure in it.
Bellow is a simple example, with a regulatory domain cached using the stack.
Your implementation may vary (read EEPROM cache instead, for example).
Example cache of some regulatory domain
struct ieee80211_regdomain mydriver_jp_regdom = {
.n_reg_rules = 3,
.alpha2 = "JP",
//.alpha2 = "99", /* If I have no alpha2 to map it to */
.reg_rules = {
/* IEEE 802.11b/g, channels 1..14 */
REG_RULE(2412-20, 2484+20, 40, 6, 20, 0),
/* IEEE 802.11a, channels 34..48 */
REG_RULE(5170-20, 5240+20, 40, 6, 20,
NL80211_RRF_NO_IR),
/* IEEE 802.11a, channels 52..64 */
REG_RULE(5260-20, 5320+20, 40, 6, 20,
NL80211_RRF_NO_IR|
NL80211_RRF_DFS),
}
};
Then in some part of your code after your wiphy has been registered:
struct ieee80211_regdomain *rd;
int size_of_regd;
int num_rules = mydriver_jp_regdom.n_reg_rules;
unsigned int i;
size_of_regd = sizeof(struct ieee80211_regdomain) +
(num_rules * sizeof(struct ieee80211_reg_rule));
rd = kzalloc(size_of_regd, GFP_KERNEL);
if (!rd)
return -ENOMEM;
memcpy(rd, &mydriver_jp_regdom, sizeof(struct ieee80211_regdomain));
for (i=0; i < num_rules; i++)
memcpy(&rd->reg_rules[i],
&mydriver_jp_regdom.reg_rules[i],
sizeof(struct ieee80211_reg_rule));
regulatory_struct_hint(rd);
Statically compiled regulatory database
---------------------------------------
In most situations the userland solution using CRDA as described
above is the preferred solution. However in some cases a set of
rules built into the kernel itself may be desirable. To account
for this situation, a configuration option has been provided
(i.e. CONFIG_CFG80211_INTERNAL_REGDB). With this option enabled,
the wireless database information contained in net/wireless/db.txt is
used to generate a data structure encoded in net/wireless/regdb.c.
That option also enables code in net/wireless/reg.c which queries
the data in regdb.c as an alternative to using CRDA.
The file net/wireless/db.txt should be kept up-to-date with the db.txt
file available in the git repository here:
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-regdb.git
Again, most users in most situations should be using the CRDA package
provided with their distribution, and in most other situations users
should be building and using CRDA on their own rather than using
this option. If you are not absolutely sure that you should be using
CONFIG_CFG80211_INTERNAL_REGDB then _DO_NOT_USE_IT_.

View file

@ -0,0 +1,947 @@
======================
RxRPC NETWORK PROTOCOL
======================
The RxRPC protocol driver provides a reliable two-phase transport on top of UDP
that can be used to perform RxRPC remote operations. This is done over sockets
of AF_RXRPC family, using sendmsg() and recvmsg() with control data to send and
receive data, aborts and errors.
Contents of this document:
(*) Overview.
(*) RxRPC protocol summary.
(*) AF_RXRPC driver model.
(*) Control messages.
(*) Socket options.
(*) Security.
(*) Example client usage.
(*) Example server usage.
(*) AF_RXRPC kernel interface.
(*) Configurable parameters.
========
OVERVIEW
========
RxRPC is a two-layer protocol. There is a session layer which provides
reliable virtual connections using UDP over IPv4 (or IPv6) as the transport
layer, but implements a real network protocol; and there's the presentation
layer which renders structured data to binary blobs and back again using XDR
(as does SunRPC):
+-------------+
| Application |
+-------------+
| XDR | Presentation
+-------------+
| RxRPC | Session
+-------------+
| UDP | Transport
+-------------+
AF_RXRPC provides:
(1) Part of an RxRPC facility for both kernel and userspace applications by
making the session part of it a Linux network protocol (AF_RXRPC).
(2) A two-phase protocol. The client transmits a blob (the request) and then
receives a blob (the reply), and the server receives the request and then
transmits the reply.
(3) Retention of the reusable bits of the transport system set up for one call
to speed up subsequent calls.
(4) A secure protocol, using the Linux kernel's key retention facility to
manage security on the client end. The server end must of necessity be
more active in security negotiations.
AF_RXRPC does not provide XDR marshalling/presentation facilities. That is
left to the application. AF_RXRPC only deals in blobs. Even the operation ID
is just the first four bytes of the request blob, and as such is beyond the
kernel's interest.
Sockets of AF_RXRPC family are:
(1) created as type SOCK_DGRAM;
(2) provided with a protocol of the type of underlying transport they're going
to use - currently only PF_INET is supported.
The Andrew File System (AFS) is an example of an application that uses this and
that has both kernel (filesystem) and userspace (utility) components.
======================
RXRPC PROTOCOL SUMMARY
======================
An overview of the RxRPC protocol:
(*) RxRPC sits on top of another networking protocol (UDP is the only option
currently), and uses this to provide network transport. UDP ports, for
example, provide transport endpoints.
(*) RxRPC supports multiple virtual "connections" from any given transport
endpoint, thus allowing the endpoints to be shared, even to the same
remote endpoint.
(*) Each connection goes to a particular "service". A connection may not go
to multiple services. A service may be considered the RxRPC equivalent of
a port number. AF_RXRPC permits multiple services to share an endpoint.
(*) Client-originating packets are marked, thus a transport endpoint can be
shared between client and server connections (connections have a
direction).
(*) Up to a billion connections may be supported concurrently between one
local transport endpoint and one service on one remote endpoint. An RxRPC
connection is described by seven numbers:
Local address }
Local port } Transport (UDP) address
Remote address }
Remote port }
Direction
Connection ID
Service ID
(*) Each RxRPC operation is a "call". A connection may make up to four
billion calls, but only up to four calls may be in progress on a
connection at any one time.
(*) Calls are two-phase and asymmetric: the client sends its request data,
which the service receives; then the service transmits the reply data
which the client receives.
(*) The data blobs are of indefinite size, the end of a phase is marked with a
flag in the packet. The number of packets of data making up one blob may
not exceed 4 billion, however, as this would cause the sequence number to
wrap.
(*) The first four bytes of the request data are the service operation ID.
(*) Security is negotiated on a per-connection basis. The connection is
initiated by the first data packet on it arriving. If security is
requested, the server then issues a "challenge" and then the client
replies with a "response". If the response is successful, the security is
set for the lifetime of that connection, and all subsequent calls made
upon it use that same security. In the event that the server lets a
connection lapse before the client, the security will be renegotiated if
the client uses the connection again.
(*) Calls use ACK packets to handle reliability. Data packets are also
explicitly sequenced per call.
(*) There are two types of positive acknowledgment: hard-ACKs and soft-ACKs.
A hard-ACK indicates to the far side that all the data received to a point
has been received and processed; a soft-ACK indicates that the data has
been received but may yet be discarded and re-requested. The sender may
not discard any transmittable packets until they've been hard-ACK'd.
(*) Reception of a reply data packet implicitly hard-ACK's all the data
packets that make up the request.
(*) An call is complete when the request has been sent, the reply has been
received and the final hard-ACK on the last packet of the reply has
reached the server.
(*) An call may be aborted by either end at any time up to its completion.
=====================
AF_RXRPC DRIVER MODEL
=====================
About the AF_RXRPC driver:
(*) The AF_RXRPC protocol transparently uses internal sockets of the transport
protocol to represent transport endpoints.
(*) AF_RXRPC sockets map onto RxRPC connection bundles. Actual RxRPC
connections are handled transparently. One client socket may be used to
make multiple simultaneous calls to the same service. One server socket
may handle calls from many clients.
(*) Additional parallel client connections will be initiated to support extra
concurrent calls, up to a tunable limit.
(*) Each connection is retained for a certain amount of time [tunable] after
the last call currently using it has completed in case a new call is made
that could reuse it.
(*) Each internal UDP socket is retained [tunable] for a certain amount of
time [tunable] after the last connection using it discarded, in case a new
connection is made that could use it.
(*) A client-side connection is only shared between calls if they have have
the same key struct describing their security (and assuming the calls
would otherwise share the connection). Non-secured calls would also be
able to share connections with each other.
(*) A server-side connection is shared if the client says it is.
(*) ACK'ing is handled by the protocol driver automatically, including ping
replying.
(*) SO_KEEPALIVE automatically pings the other side to keep the connection
alive [TODO].
(*) If an ICMP error is received, all calls affected by that error will be
aborted with an appropriate network error passed through recvmsg().
Interaction with the user of the RxRPC socket:
(*) A socket is made into a server socket by binding an address with a
non-zero service ID.
(*) In the client, sending a request is achieved with one or more sendmsgs,
followed by the reply being received with one or more recvmsgs.
(*) The first sendmsg for a request to be sent from a client contains a tag to
be used in all other sendmsgs or recvmsgs associated with that call. The
tag is carried in the control data.
(*) connect() is used to supply a default destination address for a client
socket. This may be overridden by supplying an alternate address to the
first sendmsg() of a call (struct msghdr::msg_name).
(*) If connect() is called on an unbound client, a random local port will
bound before the operation takes place.
(*) A server socket may also be used to make client calls. To do this, the
first sendmsg() of the call must specify the target address. The server's
transport endpoint is used to send the packets.
(*) Once the application has received the last message associated with a call,
the tag is guaranteed not to be seen again, and so it can be used to pin
client resources. A new call can then be initiated with the same tag
without fear of interference.
(*) In the server, a request is received with one or more recvmsgs, then the
the reply is transmitted with one or more sendmsgs, and then the final ACK
is received with a last recvmsg.
(*) When sending data for a call, sendmsg is given MSG_MORE if there's more
data to come on that call.
(*) When receiving data for a call, recvmsg flags MSG_MORE if there's more
data to come for that call.
(*) When receiving data or messages for a call, MSG_EOR is flagged by recvmsg
to indicate the terminal message for that call.
(*) A call may be aborted by adding an abort control message to the control
data. Issuing an abort terminates the kernel's use of that call's tag.
Any messages waiting in the receive queue for that call will be discarded.
(*) Aborts, busy notifications and challenge packets are delivered by recvmsg,
and control data messages will be set to indicate the context. Receiving
an abort or a busy message terminates the kernel's use of that call's tag.
(*) The control data part of the msghdr struct is used for a number of things:
(*) The tag of the intended or affected call.
(*) Sending or receiving errors, aborts and busy notifications.
(*) Notifications of incoming calls.
(*) Sending debug requests and receiving debug replies [TODO].
(*) When the kernel has received and set up an incoming call, it sends a
message to server application to let it know there's a new call awaiting
its acceptance [recvmsg reports a special control message]. The server
application then uses sendmsg to assign a tag to the new call. Once that
is done, the first part of the request data will be delivered by recvmsg.
(*) The server application has to provide the server socket with a keyring of
secret keys corresponding to the security types it permits. When a secure
connection is being set up, the kernel looks up the appropriate secret key
in the keyring and then sends a challenge packet to the client and
receives a response packet. The kernel then checks the authorisation of
the packet and either aborts the connection or sets up the security.
(*) The name of the key a client will use to secure its communications is
nominated by a socket option.
Notes on recvmsg:
(*) If there's a sequence of data messages belonging to a particular call on
the receive queue, then recvmsg will keep working through them until:
(a) it meets the end of that call's received data,
(b) it meets a non-data message,
(c) it meets a message belonging to a different call, or
(d) it fills the user buffer.
If recvmsg is called in blocking mode, it will keep sleeping, awaiting the
reception of further data, until one of the above four conditions is met.
(2) MSG_PEEK operates similarly, but will return immediately if it has put any
data in the buffer rather than sleeping until it can fill the buffer.
(3) If a data message is only partially consumed in filling a user buffer,
then the remainder of that message will be left on the front of the queue
for the next taker. MSG_TRUNC will never be flagged.
(4) If there is more data to be had on a call (it hasn't copied the last byte
of the last data message in that phase yet), then MSG_MORE will be
flagged.
================
CONTROL MESSAGES
================
AF_RXRPC makes use of control messages in sendmsg() and recvmsg() to multiplex
calls, to invoke certain actions and to report certain conditions. These are:
MESSAGE ID SRT DATA MEANING
======================= === =========== ===============================
RXRPC_USER_CALL_ID sr- User ID App's call specifier
RXRPC_ABORT srt Abort code Abort code to issue/received
RXRPC_ACK -rt n/a Final ACK received
RXRPC_NET_ERROR -rt error num Network error on call
RXRPC_BUSY -rt n/a Call rejected (server busy)
RXRPC_LOCAL_ERROR -rt error num Local error encountered
RXRPC_NEW_CALL -r- n/a New call received
RXRPC_ACCEPT s-- n/a Accept new call
(SRT = usable in Sendmsg / delivered by Recvmsg / Terminal message)
(*) RXRPC_USER_CALL_ID
This is used to indicate the application's call ID. It's an unsigned long
that the app specifies in the client by attaching it to the first data
message or in the server by passing it in association with an RXRPC_ACCEPT
message. recvmsg() passes it in conjunction with all messages except
those of the RXRPC_NEW_CALL message.
(*) RXRPC_ABORT
This is can be used by an application to abort a call by passing it to
sendmsg, or it can be delivered by recvmsg to indicate a remote abort was
received. Either way, it must be associated with an RXRPC_USER_CALL_ID to
specify the call affected. If an abort is being sent, then error EBADSLT
will be returned if there is no call with that user ID.
(*) RXRPC_ACK
This is delivered to a server application to indicate that the final ACK
of a call was received from the client. It will be associated with an
RXRPC_USER_CALL_ID to indicate the call that's now complete.
(*) RXRPC_NET_ERROR
This is delivered to an application to indicate that an ICMP error message
was encountered in the process of trying to talk to the peer. An
errno-class integer value will be included in the control message data
indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
affected.
(*) RXRPC_BUSY
This is delivered to a client application to indicate that a call was
rejected by the server due to the server being busy. It will be
associated with an RXRPC_USER_CALL_ID to indicate the rejected call.
(*) RXRPC_LOCAL_ERROR
This is delivered to an application to indicate that a local error was
encountered and that a call has been aborted because of it. An
errno-class integer value will be included in the control message data
indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
affected.
(*) RXRPC_NEW_CALL
This is delivered to indicate to a server application that a new call has
arrived and is awaiting acceptance. No user ID is associated with this,
as a user ID must subsequently be assigned by doing an RXRPC_ACCEPT.
(*) RXRPC_ACCEPT
This is used by a server application to attempt to accept a call and
assign it a user ID. It should be associated with an RXRPC_USER_CALL_ID
to indicate the user ID to be assigned. If there is no call to be
accepted (it may have timed out, been aborted, etc.), then sendmsg will
return error ENODATA. If the user ID is already in use by another call,
then error EBADSLT will be returned.
==============
SOCKET OPTIONS
==============
AF_RXRPC sockets support a few socket options at the SOL_RXRPC level:
(*) RXRPC_SECURITY_KEY
This is used to specify the description of the key to be used. The key is
extracted from the calling process's keyrings with request_key() and
should be of "rxrpc" type.
The optval pointer points to the description string, and optlen indicates
how long the string is, without the NUL terminator.
(*) RXRPC_SECURITY_KEYRING
Similar to above but specifies a keyring of server secret keys to use (key
type "keyring"). See the "Security" section.
(*) RXRPC_EXCLUSIVE_CONNECTION
This is used to request that new connections should be used for each call
made subsequently on this socket. optval should be NULL and optlen 0.
(*) RXRPC_MIN_SECURITY_LEVEL
This is used to specify the minimum security level required for calls on
this socket. optval must point to an int containing one of the following
values:
(a) RXRPC_SECURITY_PLAIN
Encrypted checksum only.
(b) RXRPC_SECURITY_AUTH
Encrypted checksum plus packet padded and first eight bytes of packet
encrypted - which includes the actual packet length.
(c) RXRPC_SECURITY_ENCRYPTED
Encrypted checksum plus entire packet padded and encrypted, including
actual packet length.
========
SECURITY
========
Currently, only the kerberos 4 equivalent protocol has been implemented
(security index 2 - rxkad). This requires the rxkad module to be loaded and,
on the client, tickets of the appropriate type to be obtained from the AFS
kaserver or the kerberos server and installed as "rxrpc" type keys. This is
normally done using the klog program. An example simple klog program can be
found at:
http://people.redhat.com/~dhowells/rxrpc/klog.c
The payload provided to add_key() on the client should be of the following
form:
struct rxrpc_key_sec2_v1 {
uint16_t security_index; /* 2 */
uint16_t ticket_length; /* length of ticket[] */
uint32_t expiry; /* time at which expires */
uint8_t kvno; /* key version number */
uint8_t __pad[3];
uint8_t session_key[8]; /* DES session key */
uint8_t ticket[0]; /* the encrypted ticket */
};
Where the ticket blob is just appended to the above structure.
For the server, keys of type "rxrpc_s" must be made available to the server.
They have a description of "<serviceID>:<securityIndex>" (eg: "52:2" for an
rxkad key for the AFS VL service). When such a key is created, it should be
given the server's secret key as the instantiation data (see the example
below).
add_key("rxrpc_s", "52:2", secret_key, 8, keyring);
A keyring is passed to the server socket by naming it in a sockopt. The server
socket then looks the server secret keys up in this keyring when secure
incoming connections are made. This can be seen in an example program that can
be found at:
http://people.redhat.com/~dhowells/rxrpc/listen.c
====================
EXAMPLE CLIENT USAGE
====================
A client would issue an operation by:
(1) An RxRPC socket is set up by:
client = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
Where the third parameter indicates the protocol family of the transport
socket used - usually IPv4 but it can also be IPv6 [TODO].
(2) A local address can optionally be bound:
struct sockaddr_rxrpc srx = {
.srx_family = AF_RXRPC,
.srx_service = 0, /* we're a client */
.transport_type = SOCK_DGRAM, /* type of transport socket */
.transport.sin_family = AF_INET,
.transport.sin_port = htons(7000), /* AFS callback */
.transport.sin_address = 0, /* all local interfaces */
};
bind(client, &srx, sizeof(srx));
This specifies the local UDP port to be used. If not given, a random
non-privileged port will be used. A UDP port may be shared between
several unrelated RxRPC sockets. Security is handled on a basis of
per-RxRPC virtual connection.
(3) The security is set:
const char *key = "AFS:cambridge.redhat.com";
setsockopt(client, SOL_RXRPC, RXRPC_SECURITY_KEY, key, strlen(key));
This issues a request_key() to get the key representing the security
context. The minimum security level can be set:
unsigned int sec = RXRPC_SECURITY_ENCRYPTED;
setsockopt(client, SOL_RXRPC, RXRPC_MIN_SECURITY_LEVEL,
&sec, sizeof(sec));
(4) The server to be contacted can then be specified (alternatively this can
be done through sendmsg):
struct sockaddr_rxrpc srx = {
.srx_family = AF_RXRPC,
.srx_service = VL_SERVICE_ID,
.transport_type = SOCK_DGRAM, /* type of transport socket */
.transport.sin_family = AF_INET,
.transport.sin_port = htons(7005), /* AFS volume manager */
.transport.sin_address = ...,
};
connect(client, &srx, sizeof(srx));
(5) The request data should then be posted to the server socket using a series
of sendmsg() calls, each with the following control message attached:
RXRPC_USER_CALL_ID - specifies the user ID for this call
MSG_MORE should be set in msghdr::msg_flags on all but the last part of
the request. Multiple requests may be made simultaneously.
If a call is intended to go to a destination other than the default
specified through connect(), then msghdr::msg_name should be set on the
first request message of that call.
(6) The reply data will then be posted to the server socket for recvmsg() to
pick up. MSG_MORE will be flagged by recvmsg() if there's more reply data
for a particular call to be read. MSG_EOR will be set on the terminal
read for a call.
All data will be delivered with the following control message attached:
RXRPC_USER_CALL_ID - specifies the user ID for this call
If an abort or error occurred, this will be returned in the control data
buffer instead, and MSG_EOR will be flagged to indicate the end of that
call.
====================
EXAMPLE SERVER USAGE
====================
A server would be set up to accept operations in the following manner:
(1) An RxRPC socket is created by:
server = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
Where the third parameter indicates the address type of the transport
socket used - usually IPv4.
(2) Security is set up if desired by giving the socket a keyring with server
secret keys in it:
keyring = add_key("keyring", "AFSkeys", NULL, 0,
KEY_SPEC_PROCESS_KEYRING);
const char secret_key[8] = {
0xa7, 0x83, 0x8a, 0xcb, 0xc7, 0x83, 0xec, 0x94 };
add_key("rxrpc_s", "52:2", secret_key, 8, keyring);
setsockopt(server, SOL_RXRPC, RXRPC_SECURITY_KEYRING, "AFSkeys", 7);
The keyring can be manipulated after it has been given to the socket. This
permits the server to add more keys, replace keys, etc. whilst it is live.
(2) A local address must then be bound:
struct sockaddr_rxrpc srx = {
.srx_family = AF_RXRPC,
.srx_service = VL_SERVICE_ID, /* RxRPC service ID */
.transport_type = SOCK_DGRAM, /* type of transport socket */
.transport.sin_family = AF_INET,
.transport.sin_port = htons(7000), /* AFS callback */
.transport.sin_address = 0, /* all local interfaces */
};
bind(server, &srx, sizeof(srx));
(3) The server is then set to listen out for incoming calls:
listen(server, 100);
(4) The kernel notifies the server of pending incoming connections by sending
it a message for each. This is received with recvmsg() on the server
socket. It has no data, and has a single dataless control message
attached:
RXRPC_NEW_CALL
The address that can be passed back by recvmsg() at this point should be
ignored since the call for which the message was posted may have gone by
the time it is accepted - in which case the first call still on the queue
will be accepted.
(5) The server then accepts the new call by issuing a sendmsg() with two
pieces of control data and no actual data:
RXRPC_ACCEPT - indicate connection acceptance
RXRPC_USER_CALL_ID - specify user ID for this call
(6) The first request data packet will then be posted to the server socket for
recvmsg() to pick up. At that point, the RxRPC address for the call can
be read from the address fields in the msghdr struct.
Subsequent request data will be posted to the server socket for recvmsg()
to collect as it arrives. All but the last piece of the request data will
be delivered with MSG_MORE flagged.
All data will be delivered with the following control message attached:
RXRPC_USER_CALL_ID - specifies the user ID for this call
(8) The reply data should then be posted to the server socket using a series
of sendmsg() calls, each with the following control messages attached:
RXRPC_USER_CALL_ID - specifies the user ID for this call
MSG_MORE should be set in msghdr::msg_flags on all but the last message
for a particular call.
(9) The final ACK from the client will be posted for retrieval by recvmsg()
when it is received. It will take the form of a dataless message with two
control messages attached:
RXRPC_USER_CALL_ID - specifies the user ID for this call
RXRPC_ACK - indicates final ACK (no data)
MSG_EOR will be flagged to indicate that this is the final message for
this call.
(10) Up to the point the final packet of reply data is sent, the call can be
aborted by calling sendmsg() with a dataless message with the following
control messages attached:
RXRPC_USER_CALL_ID - specifies the user ID for this call
RXRPC_ABORT - indicates abort code (4 byte data)
Any packets waiting in the socket's receive queue will be discarded if
this is issued.
Note that all the communications for a particular service take place through
the one server socket, using control messages on sendmsg() and recvmsg() to
determine the call affected.
=========================
AF_RXRPC KERNEL INTERFACE
=========================
The AF_RXRPC module also provides an interface for use by in-kernel utilities
such as the AFS filesystem. This permits such a utility to:
(1) Use different keys directly on individual client calls on one socket
rather than having to open a whole slew of sockets, one for each key it
might want to use.
(2) Avoid having RxRPC call request_key() at the point of issue of a call or
opening of a socket. Instead the utility is responsible for requesting a
key at the appropriate point. AFS, for instance, would do this during VFS
operations such as open() or unlink(). The key is then handed through
when the call is initiated.
(3) Request the use of something other than GFP_KERNEL to allocate memory.
(4) Avoid the overhead of using the recvmsg() call. RxRPC messages can be
intercepted before they get put into the socket Rx queue and the socket
buffers manipulated directly.
To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket,
bind an address as appropriate and listen if it's to be a server socket, but
then it passes this to the kernel interface functions.
The kernel interface functions are as follows:
(*) Begin a new client call.
struct rxrpc_call *
rxrpc_kernel_begin_call(struct socket *sock,
struct sockaddr_rxrpc *srx,
struct key *key,
unsigned long user_call_ID,
gfp_t gfp);
This allocates the infrastructure to make a new RxRPC call and assigns
call and connection numbers. The call will be made on the UDP port that
the socket is bound to. The call will go to the destination address of a
connected client socket unless an alternative is supplied (srx is
non-NULL).
If a key is supplied then this will be used to secure the call instead of
the key bound to the socket with the RXRPC_SECURITY_KEY sockopt. Calls
secured in this way will still share connections if at all possible.
The user_call_ID is equivalent to that supplied to sendmsg() in the
control data buffer. It is entirely feasible to use this to point to a
kernel data structure.
If this function is successful, an opaque reference to the RxRPC call is
returned. The caller now holds a reference on this and it must be
properly ended.
(*) End a client call.
void rxrpc_kernel_end_call(struct rxrpc_call *call);
This is used to end a previously begun call. The user_call_ID is expunged
from AF_RXRPC's knowledge and will not be seen again in association with
the specified call.
(*) Send data through a call.
int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg,
size_t len);
This is used to supply either the request part of a client call or the
reply part of a server call. msg.msg_iovlen and msg.msg_iov specify the
data buffers to be used. msg_iov may not be NULL and must point
exclusively to in-kernel virtual addresses. msg.msg_flags may be given
MSG_MORE if there will be subsequent data sends for this call.
The msg must not specify a destination address, control data or any flags
other than MSG_MORE. len is the total amount of data to transmit.
(*) Abort a call.
void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code);
This is used to abort a call if it's still in an abortable state. The
abort code specified will be placed in the ABORT message sent.
(*) Intercept received RxRPC messages.
typedef void (*rxrpc_interceptor_t)(struct sock *sk,
unsigned long user_call_ID,
struct sk_buff *skb);
void
rxrpc_kernel_intercept_rx_messages(struct socket *sock,
rxrpc_interceptor_t interceptor);
This installs an interceptor function on the specified AF_RXRPC socket.
All messages that would otherwise wind up in the socket's Rx queue are
then diverted to this function. Note that care must be taken to process
the messages in the right order to maintain DATA message sequentiality.
The interceptor function itself is provided with the address of the socket
and handling the incoming message, the ID assigned by the kernel utility
to the call and the socket buffer containing the message.
The skb->mark field indicates the type of message:
MARK MEANING
=============================== =======================================
RXRPC_SKB_MARK_DATA Data message
RXRPC_SKB_MARK_FINAL_ACK Final ACK received for an incoming call
RXRPC_SKB_MARK_BUSY Client call rejected as server busy
RXRPC_SKB_MARK_REMOTE_ABORT Call aborted by peer
RXRPC_SKB_MARK_NET_ERROR Network error detected
RXRPC_SKB_MARK_LOCAL_ERROR Local error encountered
RXRPC_SKB_MARK_NEW_CALL New incoming call awaiting acceptance
The remote abort message can be probed with rxrpc_kernel_get_abort_code().
The two error messages can be probed with rxrpc_kernel_get_error_number().
A new call can be accepted with rxrpc_kernel_accept_call().
Data messages can have their contents extracted with the usual bunch of
socket buffer manipulation functions. A data message can be determined to
be the last one in a sequence with rxrpc_kernel_is_data_last(). When a
data message has been used up, rxrpc_kernel_data_delivered() should be
called on it..
Non-data messages should be handled to rxrpc_kernel_free_skb() to dispose
of. It is possible to get extra refs on all types of message for later
freeing, but this may pin the state of a call until the message is finally
freed.
(*) Accept an incoming call.
struct rxrpc_call *
rxrpc_kernel_accept_call(struct socket *sock,
unsigned long user_call_ID);
This is used to accept an incoming call and to assign it a call ID. This
function is similar to rxrpc_kernel_begin_call() and calls accepted must
be ended in the same way.
If this function is successful, an opaque reference to the RxRPC call is
returned. The caller now holds a reference on this and it must be
properly ended.
(*) Reject an incoming call.
int rxrpc_kernel_reject_call(struct socket *sock);
This is used to reject the first incoming call on the socket's queue with
a BUSY message. -ENODATA is returned if there were no incoming calls.
Other errors may be returned if the call had been aborted (-ECONNABORTED)
or had timed out (-ETIME).
(*) Record the delivery of a data message and free it.
void rxrpc_kernel_data_delivered(struct sk_buff *skb);
This is used to record a data message as having been delivered and to
update the ACK state for the call. The socket buffer will be freed.
(*) Free a message.
void rxrpc_kernel_free_skb(struct sk_buff *skb);
This is used to free a non-DATA socket buffer intercepted from an AF_RXRPC
socket.
(*) Determine if a data message is the last one on a call.
bool rxrpc_kernel_is_data_last(struct sk_buff *skb);
This is used to determine if a socket buffer holds the last data message
to be received for a call (true will be returned if it does, false
if not).
The data message will be part of the reply on a client call and the
request on an incoming call. In the latter case there will be more
messages, but in the former case there will not.
(*) Get the abort code from an abort message.
u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb);
This is used to extract the abort code from a remote abort message.
(*) Get the error number from a local or network error message.
int rxrpc_kernel_get_error_number(struct sk_buff *skb);
This is used to extract the error number from a message indicating either
a local error occurred or a network error occurred.
(*) Allocate a null key for doing anonymous security.
struct key *rxrpc_get_null_key(const char *keyname);
This is used to allocate a null RxRPC key that can be used to indicate
anonymous security for a particular domain.
=======================
CONFIGURABLE PARAMETERS
=======================
The RxRPC protocol driver has a number of configurable parameters that can be
adjusted through sysctls in /proc/net/rxrpc/:
(*) req_ack_delay
The amount of time in milliseconds after receiving a packet with the
request-ack flag set before we honour the flag and actually send the
requested ack.
Usually the other side won't stop sending packets until the advertised
reception window is full (to a maximum of 255 packets), so delaying the
ACK permits several packets to be ACK'd in one go.
(*) soft_ack_delay
The amount of time in milliseconds after receiving a new packet before we
generate a soft-ACK to tell the sender that it doesn't need to resend.
(*) idle_ack_delay
The amount of time in milliseconds after all the packets currently in the
received queue have been consumed before we generate a hard-ACK to tell
the sender it can free its buffers, assuming no other reason occurs that
we would send an ACK.
(*) resend_timeout
The amount of time in milliseconds after transmitting a packet before we
transmit it again, assuming no ACK is received from the receiver telling
us they got it.
(*) max_call_lifetime
The maximum amount of time in seconds that a call may be in progress
before we preemptively kill it.
(*) dead_call_expiry
The amount of time in seconds before we remove a dead call from the call
list. Dead calls are kept around for a little while for the purpose of
repeating ACK and ABORT packets.
(*) connection_expiry
The amount of time in seconds after a connection was last used before we
remove it from the connection list. Whilst a connection is in existence,
it serves as a placeholder for negotiated security; when it is deleted,
the security must be renegotiated.
(*) transport_expiry
The amount of time in seconds after a transport was last used before we
remove it from the transport list. Whilst a transport is in existence, it
serves to anchor the peer data and keeps the connection ID counter.
(*) rxrpc_rx_window_size
The size of the receive window in packets. This is the maximum number of
unconsumed received packets we're willing to hold in memory for any
particular call.
(*) rxrpc_rx_mtu
The maximum packet MTU size that we're willing to receive in bytes. This
indicates to the peer whether we're willing to accept jumbo packets.
(*) rxrpc_rx_jumbo_max
The maximum number of packets that we're willing to accept in a jumbo
packet. Non-terminal packets in a jumbo packet must contain a four byte
header plus exactly 1412 bytes of data. The terminal packet must contain
a four byte header plus any amount of data. In any event, a jumbo packet
may not exceed rxrpc_rx_mtu in size.

View file

@ -0,0 +1,141 @@
Release notes for Neterion's (Formerly S2io) Xframe I/II PCI-X 10GbE driver.
Contents
=======
- 1. Introduction
- 2. Identifying the adapter/interface
- 3. Features supported
- 4. Command line parameters
- 5. Performance suggestions
- 6. Available Downloads
1. Introduction:
This Linux driver supports Neterion's Xframe I PCI-X 1.0 and
Xframe II PCI-X 2.0 adapters. It supports several features
such as jumbo frames, MSI/MSI-X, checksum offloads, TSO, UFO and so on.
See below for complete list of features.
All features are supported for both IPv4 and IPv6.
2. Identifying the adapter/interface:
a. Insert the adapter(s) in your system.
b. Build and load driver
# insmod s2io.ko
c. View log messages
# dmesg | tail -40
You will see messages similar to:
eth3: Neterion Xframe I 10GbE adapter (rev 3), Version 2.0.9.1, Intr type INTA
eth4: Neterion Xframe II 10GbE adapter (rev 2), Version 2.0.9.1, Intr type INTA
eth4: Device is on 64 bit 133MHz PCIX(M1) bus
The above messages identify the adapter type(Xframe I/II), adapter revision,
driver version, interface name(eth3, eth4), Interrupt type(INTA, MSI, MSI-X).
In case of Xframe II, the PCI/PCI-X bus width and frequency are displayed
as well.
To associate an interface with a physical adapter use "ethtool -p <ethX>".
The corresponding adapter's LED will blink multiple times.
3. Features supported:
a. Jumbo frames. Xframe I/II supports MTU up to 9600 bytes,
modifiable using ifconfig command.
b. Offloads. Supports checksum offload(TCP/UDP/IP) on transmit
and receive, TSO.
c. Multi-buffer receive mode. Scattering of packet across multiple
buffers. Currently driver supports 2-buffer mode which yields
significant performance improvement on certain platforms(SGI Altix,
IBM xSeries).
d. MSI/MSI-X. Can be enabled on platforms which support this feature
(IA64, Xeon) resulting in noticeable performance improvement(up to 7%
on certain platforms).
e. Statistics. Comprehensive MAC-level and software statistics displayed
using "ethtool -S" option.
f. Multi-FIFO/Ring. Supports up to 8 transmit queues and receive rings,
with multiple steering options.
4. Command line parameters
a. tx_fifo_num
Number of transmit queues
Valid range: 1-8
Default: 1
b. rx_ring_num
Number of receive rings
Valid range: 1-8
Default: 1
c. tx_fifo_len
Size of each transmit queue
Valid range: Total length of all queues should not exceed 8192
Default: 4096
d. rx_ring_sz
Size of each receive ring(in 4K blocks)
Valid range: Limited by memory on system
Default: 30
e. intr_type
Specifies interrupt type. Possible values 0(INTA), 2(MSI-X)
Valid values: 0, 2
Default: 2
5. Performance suggestions
General:
a. Set MTU to maximum(9000 for switch setup, 9600 in back-to-back configuration)
b. Set TCP windows size to optimal value.
For instance, for MTU=1500 a value of 210K has been observed to result in
good performance.
# sysctl -w net.ipv4.tcp_rmem="210000 210000 210000"
# sysctl -w net.ipv4.tcp_wmem="210000 210000 210000"
For MTU=9000, TCP window size of 10 MB is recommended.
# sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"
# sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"
Transmit performance:
a. By default, the driver respects BIOS settings for PCI bus parameters.
However, you may want to experiment with PCI bus parameters
max-split-transactions(MOST) and MMRBC (use setpci command).
A MOST value of 2 has been found optimal for Opterons and 3 for Itanium.
It could be different for your hardware.
Set MMRBC to 4K**.
For example you can set
For opteron
#setpci -d 17d5:* 62=1d
For Itanium
#setpci -d 17d5:* 62=3d
For detailed description of the PCI registers, please see Xframe User Guide.
b. Ensure Transmit Checksum offload is enabled. Use ethtool to set/verify this
parameter.
c. Turn on TSO(using "ethtool -K")
# ethtool -K <ethX> tso on
Receive performance:
a. By default, the driver respects BIOS settings for PCI bus parameters.
However, you may want to set PCI latency timer to 248.
#setpci -d 17d5:* LATENCY_TIMER=f8
For detailed description of the PCI registers, please see Xframe User Guide.
b. Use 2-buffer mode. This results in large performance boost on
certain platforms(eg. SGI Altix, IBM xSeries).
c. Ensure Receive Checksum offload is enabled. Use "ethtool -K ethX" command to
set/verify this option.
d. Enable NAPI feature(in kernel configuration Device Drivers ---> Network
device support ---> Ethernet (10000 Mbit) ---> S2IO 10Gbe Xframe NIC) to
bring down CPU utilization.
** For AMD opteron platforms with 8131 chipset, MMRBC=1 and MOST=1 are
recommended as safe parameters.
For more information, please review the AMD8131 errata at
http://vip.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/
26310_AMD-8131_HyperTransport_PCI-X_Tunnel_Revision_Guide_rev_3_18.pdf
6. Support
For further support please contact either your 10GbE Xframe NIC vendor (IBM,
HP, SGI etc.)

Some files were not shown because too many files have changed in this diff Show more