Mirror of https://github.com/AetherDroid/android_kernel_samsung_on5xelte.git
(synced 2025-10-29 15:28:50 +01:00)

commit f6dfaef42e
Fixed MTP to work with TWRP

50820 changed files with 20846062 additions and 0 deletions
234
Documentation/networking/00-INDEX
Normal file
@@ -0,0 +1,234 @@
00-INDEX
|
||||
- this file
|
||||
3c505.txt
|
||||
- information on the 3Com EtherLink Plus (3c505) driver.
|
||||
3c509.txt
|
||||
- information on the 3Com Etherlink III Series Ethernet cards.
|
||||
6pack.txt
|
||||
- info on the 6pack protocol, an alternative to KISS for AX.25
|
||||
LICENSE.qla3xxx
|
||||
- GPLv2 for QLogic Linux Networking HBA Driver
|
||||
LICENSE.qlge
|
||||
- GPLv2 for QLogic Linux qlge NIC Driver
|
||||
LICENSE.qlcnic
|
||||
- GPLv2 for QLogic Linux qlcnic NIC Driver
|
||||
Makefile
|
||||
- Makefile for docsrc.
|
||||
PLIP.txt
|
||||
- PLIP: The Parallel Line Internet Protocol device driver
|
||||
README.ipw2100
|
||||
- README for the Intel PRO/Wireless 2100 driver.
|
||||
README.ipw2200
|
||||
- README for the Intel PRO/Wireless 2915ABG and 2200BG driver.
|
||||
README.sb1000
|
||||
- info on General Instrument/NextLevel SURFboard1000 cable modem.
|
||||
alias.txt
|
||||
- info on using alias network devices.
|
||||
arcnet-hardware.txt
|
||||
- tons of info on ARCnet, hubs, jumper settings for ARCnet cards, etc.
|
||||
arcnet.txt
|
||||
- info on using the ARCnet driver itself.
|
||||
atm.txt
|
||||
- info on where to get ATM programs and support for Linux.
|
||||
ax25.txt
|
||||
- info on using AX.25 and NET/ROM code for Linux
|
||||
batman-adv.txt
|
||||
- B.A.T.M.A.N routing protocol on top of layer 2 Ethernet Frames.
|
||||
baycom.txt
|
||||
- info on the driver for Baycom style amateur radio modems
|
||||
bonding.txt
|
||||
- Linux Ethernet Bonding Driver HOWTO: link aggregation in Linux.
|
||||
bridge.txt
|
||||
- where to get user space programs for ethernet bridging with Linux.
|
||||
can.txt
|
||||
- documentation on CAN protocol family.
|
||||
cops.txt
|
||||
- info on the COPS LocalTalk Linux driver
|
||||
cs89x0.txt
|
||||
- the Crystal LAN (CS8900/20-based) Ethernet ISA adapter driver
|
||||
cxacru.txt
|
||||
- Conexant AccessRunner USB ADSL Modem
|
||||
cxacru-cf.py
|
||||
- Conexant AccessRunner USB ADSL Modem configuration file parser
|
||||
cxgb.txt
|
||||
- Release Notes for the Chelsio N210 Linux device driver.
|
||||
dccp.txt
|
||||
- the Datagram Congestion Control Protocol (DCCP) (RFC 4340..42).
|
||||
de4x5.txt
|
||||
- the Digital EtherWORKS DE4?? and DE5?? PCI Ethernet driver
|
||||
decnet.txt
|
||||
- info on using the DECnet networking layer in Linux.
|
||||
dl2k.txt
|
||||
- README for D-Link DL2000-based Gigabit Ethernet Adapters (dl2k.ko).
|
||||
dm9000.txt
|
||||
- README for the Simtec DM9000 Network driver.
|
||||
dmfe.txt
|
||||
- info on the Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver.
|
||||
dns_resolver.txt
|
||||
- The DNS resolver module allows kernel services to make DNS queries.
|
||||
driver.txt
|
||||
- Softnet driver issues.
|
||||
e100.txt
|
||||
- info on Intel's EtherExpress PRO/100 line of 10/100 boards
|
||||
e1000.txt
|
||||
- info on Intel's E1000 line of gigabit ethernet boards
|
||||
e1000e.txt
|
||||
- README for the Intel Gigabit Ethernet Driver (e1000e).
|
||||
eql.txt
|
||||
- serial IP load balancing
|
||||
fib_trie.txt
|
||||
- Level Compressed Trie (LC-trie) notes: a structure for routing.
|
||||
filter.txt
|
||||
- Linux Socket Filtering
|
||||
fore200e.txt
|
||||
- FORE Systems PCA-200E/SBA-200E ATM NIC driver info.
|
||||
framerelay.txt
|
||||
- info on using Frame Relay/Data Link Connection Identifier (DLCI).
|
||||
gen_stats.txt
|
||||
- Generic networking statistics for netlink users.
|
||||
generic-hdlc.txt
|
||||
- The generic High Level Data Link Control (HDLC) layer.
|
||||
generic_netlink.txt
|
||||
- info on Generic Netlink
|
||||
gianfar.txt
|
||||
- Gianfar Ethernet Driver.
|
||||
i40e.txt
|
||||
- README for the Intel Ethernet Controller XL710 Driver (i40e).
|
||||
i40evf.txt
|
||||
- Short note on the Driver for the Intel(R) XL710 X710 Virtual Function
|
||||
ieee802154.txt
|
||||
- Linux IEEE 802.15.4 implementation, API and drivers
|
||||
igb.txt
|
||||
- README for the Intel Gigabit Ethernet Driver (igb).
|
||||
igbvf.txt
|
||||
- README for the Intel Gigabit Ethernet Driver (igbvf).
|
||||
ip-sysctl.txt
|
||||
- /proc/sys/net/ipv4/* variables
|
||||
ip_dynaddr.txt
|
||||
- IP dynamic address hack e.g. for auto-dialup links
|
||||
ipddp.txt
|
||||
- AppleTalk-IP Decapsulation and AppleTalk-IP Encapsulation
|
||||
iphase.txt
|
||||
- Interphase PCI ATM (i)Chip IA Linux driver info.
|
||||
ipsec.txt
|
||||
- Note on not compressing IPSec payload and resulting failed policy check.
|
||||
ipv6.txt
|
||||
- Options to the ipv6 kernel module.
|
||||
ipvs-sysctl.txt
|
||||
- Per-inode explanation of the /proc/sys/net/ipv4/vs interface.
|
||||
irda.txt
|
||||
- where to get IrDA (infrared) utilities and info for Linux.
|
||||
ixgb.txt
|
||||
- README for the Intel 10 Gigabit Ethernet Driver (ixgb).
|
||||
ixgbe.txt
|
||||
- README for the Intel 10 Gigabit Ethernet Driver (ixgbe).
|
||||
ixgbevf.txt
|
||||
- README for the Intel Virtual Function (VF) Driver (ixgbevf).
|
||||
l2tp.txt
|
||||
- User guide to the L2TP tunnel protocol.
|
||||
lapb-module.txt
|
||||
- programming information of the LAPB module.
|
||||
ltpc.txt
|
||||
- the Apple or Farallon LocalTalk PC card driver
|
||||
mac80211-auth-assoc-deauth.txt
|
||||
- authentication and association / deauth-disassoc with mac80211
|
||||
mac80211-injection.txt
|
||||
- HOWTO use packet injection with mac80211
|
||||
multiqueue.txt
|
||||
- HOWTO for multiqueue network device support.
|
||||
netconsole.txt
|
||||
- The network console module netconsole.ko: configuration and notes.
|
||||
netdev-FAQ.txt
|
||||
- FAQ describing how to submit net changes to netdev mailing list.
|
||||
netdev-features.txt
|
||||
- Network interface features API description.
|
||||
netdevices.txt
|
||||
- info on network device driver functions exported to the kernel.
|
||||
netif-msg.txt
|
||||
- Design of the network interface message level setting (NETIF_MSG_*).
|
||||
netlink_mmap.txt
|
||||
- memory mapped I/O with netlink
|
||||
nf_conntrack-sysctl.txt
|
||||
- list of netfilter-sysctl knobs.
|
||||
nfc.txt
|
||||
- The Linux Near Field Communication (NFC) subsystem.
|
||||
openvswitch.txt
|
||||
- Open vSwitch developer documentation.
|
||||
operstates.txt
|
||||
- Overview of network interface operational states.
|
||||
packet_mmap.txt
|
||||
- User guide to memory mapped packet socket rings (PACKET_[RT]X_RING).
|
||||
phonet.txt
|
||||
- The Phonet packet protocol used in Nokia cellular modems.
|
||||
phy.txt
|
||||
- The PHY abstraction layer.
|
||||
pktgen.txt
|
||||
- User guide to the kernel packet generator (pktgen.ko).
|
||||
policy-routing.txt
|
||||
- IP policy-based routing
|
||||
ppp_generic.txt
|
||||
- Information about the generic PPP driver.
|
||||
proc_net_tcp.txt
|
||||
- Per inode overview of the /proc/net/tcp and /proc/net/tcp6 interfaces.
|
||||
radiotap-headers.txt
|
||||
- Background on radiotap headers.
|
||||
ray_cs.txt
|
||||
- Raylink Wireless LAN card driver info.
|
||||
rds.txt
|
||||
- Background on the reliable, ordered datagram delivery method RDS.
|
||||
regulatory.txt
|
||||
- Overview of the Linux wireless regulatory infrastructure.
|
||||
rxrpc.txt
|
||||
- Guide to the RxRPC protocol.
|
||||
s2io.txt
|
||||
- Release notes for Neterion Xframe I/II 10GbE driver.
|
||||
scaling.txt
|
||||
- Explanation of network scaling techniques: RSS, RPS, RFS, aRFS, XPS.
|
||||
sctp.txt
|
||||
- Notes on the Linux kernel implementation of the SCTP protocol.
|
||||
secid.txt
|
||||
- Explanation of the secid member in flow structures.
|
||||
skfp.txt
|
||||
- SysKonnect FDDI (SK-5xxx, Compaq Netelligent) driver info.
|
||||
smc9.txt
|
||||
- the driver for SMC's 9000 series of Ethernet cards
|
||||
spider_net.txt
|
||||
- README for the Spidernet Driver (as found in PS3 / Cell BE).
|
||||
stmmac.txt
|
||||
- README for the STMicro Synopsys Ethernet driver.
|
||||
tc-actions-env-rules.txt
|
||||
- rules for traffic control (tc) actions.
|
||||
timestamping.txt
|
||||
- overview of network packet timestamping variants.
|
||||
tcp.txt
|
||||
- short blurb on how TCP output takes place.
|
||||
tcp-thin.txt
|
||||
- kernel tuning options for low rate 'thin' TCP streams.
|
||||
team.txt
|
||||
- pointer to information for ethernet teaming devices.
|
||||
tlan.txt
|
||||
- ThunderLAN (Compaq Netelligent 10/100, Olicom OC-2xxx) driver info.
|
||||
tproxy.txt
|
||||
- Transparent proxy support user guide.
|
||||
tuntap.txt
|
||||
- TUN/TAP device driver, allowing user space Rx/Tx of packets.
|
||||
udplite.txt
|
||||
- UDP-Lite protocol (RFC 3828) introduction.
|
||||
vortex.txt
|
||||
- info on using 3Com Vortex (3c590, 3c592, 3c595, 3c597) Ethernet cards.
|
||||
vxge.txt
|
||||
- README for the Neterion X3100 PCIe Server Adapter.
|
||||
vxlan.txt
|
||||
- Virtual extensible LAN overview
|
||||
x25.txt
|
||||
- general info on X.25 development.
|
||||
x25-iface.txt
|
||||
- description of the X.25 Packet Layer to LAPB device interface.
|
||||
xfrm_proc.txt
|
||||
- description of the statistics package for XFRM.
|
||||
xfrm_sync.txt
|
||||
- sync patches for XFRM enable migration of an SA between hosts.
|
||||
xfrm_sysctl.txt
|
||||
- description of the XFRM configuration options.
|
||||
z8530drv.txt
|
||||
- info about Linux driver for Z8530 based HDLC cards for AX.25
|
||||
213
Documentation/networking/3c509.txt
Normal file
@@ -0,0 +1,213 @@
Linux and the 3Com EtherLink III Series Ethercards (driver v1.18c and higher)
|
||||
----------------------------------------------------------------------------
|
||||
|
||||
This file contains the instructions and caveats for v1.18c and higher versions
|
||||
of the 3c509 driver. You should not use the driver without reading this file.
|
||||
|
||||
release 1.0
|
||||
28 February 2002
|
||||
Current maintainer (corrections to):
|
||||
David Ruggiero <jdr@farfalle.com>
|
||||
|
||||
----------------------------------------------------------------------------
|
||||
|
||||
(0) Introduction
|
||||
|
||||
The following are notes and information on using the 3Com EtherLink III series
|
||||
ethercards in Linux. These cards are commonly known by the most widely-used
|
||||
card's 3Com model number, 3c509. They are all 10 Mb/s ISA-bus cards and shouldn't
|
||||
be (but sometimes are) confused with the similarly-numbered PCI-bus "3c905"
|
||||
(aka "Vortex" or "Boomerang") series. Kernel support for the 3c509 family is
|
||||
provided by the module 3c509.c, which has code to support all of the following
|
||||
models:
|
||||
|
||||
3c509 (original ISA card)
|
||||
3c509B (later revision of the ISA card; supports full-duplex)
|
||||
3c589 (PCMCIA)
|
||||
3c589B (later revision of the 3c589; supports full-duplex)
|
||||
3c579 (EISA)
|
||||
|
||||
Large portions of this documentation were heavily borrowed from the guide
|
||||
written by the original author of the 3c509 driver, Donald Becker. The master
|
||||
copy of that document, which contains notes on older versions of the driver,
|
||||
currently resides on the Scyld web server: http://www.scyld.com/.
|
||||
|
||||
|
||||
(1) Special Driver Features
|
||||
|
||||
Overriding card settings
|
||||
|
||||
The driver allows boot- or load-time overriding of the card's detected IOADDR,
|
||||
IRQ, and transceiver settings, although this capability shouldn't generally be
|
||||
needed except to enable full-duplex mode (see below). An example of the syntax
|
||||
for LILO parameters for doing this:
|
||||
|
||||
ether=10,0x310,3,0x3c509,eth0
|
||||
|
||||
This configures the first found 3c509 card for IRQ 10, base I/O 0x310, and
|
||||
transceiver type 3 (10base2). The flag "0x3c509" must be set to avoid conflicts
|
||||
with other card types when overriding the I/O address. When the driver is
|
||||
loaded as a module, only the IRQ may be overridden. For example,
|
||||
setting two cards to IRQ10 and IRQ11 is done by using the irq module
|
||||
option:
|
||||
|
||||
options 3c509 irq=10,11
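
As a rough illustration of how the fields of the LILO line above map to card
settings, the following Python sketch splits an "ether=" string into its parts.
The field order (IRQ, base I/O address, transceiver type, flag, interface name)
is taken from the example in this file. The function name parse_ether_param and
the dictionary keys are hypothetical and are not part of LILO or the kernel.

```python
def parse_ether_param(param):
    """Split an 'ether=IRQ,IOADDR,XCVR,FLAG,NAME' string into named fields.

    Hypothetical helper for illustration only; the field order follows the
    example given in this file.
    """
    prefix = "ether="
    if not param.startswith(prefix):
        raise ValueError("expected a string starting with 'ether='")
    irq, ioaddr, xcvr, flag, name = param[len(prefix):].split(",")
    return {
        "irq": int(irq, 0),          # base 0 accepts both "10" and "0x310"
        "ioaddr": int(ioaddr, 0),
        "transceiver": int(xcvr, 0),
        "flag": int(flag, 0),
        "ifname": name,
    }


settings = parse_ether_param("ether=10,0x310,3,0x3c509,eth0")
print(settings)
```

Here settings["transceiver"] is 3, which the transceiver table in section (3)
lists as 10base2.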
|
||||
|
||||
|
||||
(2) Full-duplex mode
|
||||
|
||||
The v1.18c driver added support for the 3c509B's full-duplex capabilities.
|
||||
In order to enable and successfully use full-duplex mode, three conditions
|
||||
must be met:
|
||||
|
||||
(a) You must have an EtherLink III card model whose hardware supports
full-duplex operations. Currently, the only members of the 3c509 family that are
|
||||
positively known to support full-duplex are the 3c509B (ISA bus) and 3c589B
|
||||
(PCMCIA) cards. Cards without the "B" model designation do *not* support
|
||||
full-duplex mode; these include the original 3c509 (no "B"), the original
|
||||
3c589, the 3c529 (MCA bus), and the 3c579 (EISA bus).
|
||||
|
||||
(b) You must be using your card's 10baseT transceiver (i.e., the RJ-45
|
||||
connector), not its AUI (thick-net) or 10base2 (thin-net/coax) interfaces.
|
||||
AUI and 10base2 network cabling is physically incapable of full-duplex
|
||||
operation.
|
||||
|
||||
(c) Most importantly, your 3c509B must be connected to a link partner that is
|
||||
itself full-duplex capable. This is almost certainly one of two things: a full-
|
||||
duplex-capable Ethernet switch (*not* a hub), or a full-duplex-capable NIC on
|
||||
another system that's connected directly to the 3c509B via a crossover cable.
|
||||
|
||||
Full-duplex mode can be enabled using 'ethtool'.
|
||||
|
||||
/////Extremely important caution concerning full-duplex mode/////
|
||||
Understand that the 3c509B's hardware's full-duplex support is much more
|
||||
limited than that provided by more modern network interface cards. Although
|
||||
at the physical layer of the network it fully supports full-duplex operation,
|
||||
the card was designed before the current Ethernet auto-negotiation (N-way)
|
||||
spec was written. This means that the 3c509B family ***cannot and will not
|
||||
auto-negotiate a full-duplex connection with its link partner under any
|
||||
circumstances, no matter how it is initialized***. If the full-duplex mode
|
||||
of the 3c509B is enabled, its link partner will very likely need to be
|
||||
independently _forced_ into full-duplex mode as well; otherwise various nasty
|
||||
failures will occur - at the very least, you'll see massive numbers of packet
|
||||
collisions. This is one of the very rare circumstances where disabling
auto-negotiation and forcing the duplex mode of a network interface card or switch
|
||||
would ever be necessary or desirable.
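
A minimal sketch of what "enabled using 'ethtool'" might look like in practice:
the Python snippet below only builds the argument list for an
"ethtool -s IFNAME duplex full" call and does not run it. The interface name
eth0 and the helper name ethtool_duplex_args are assumptions made for this
example; check ethtool(8) and your driver version before relying on it, and
remember that the link partner has to be forced to full duplex separately.

```python
def ethtool_duplex_args(ifname, duplex):
    """Build the argument list for 'ethtool -s IFNAME duplex half|full'.

    Hypothetical helper; it only assembles the command and does not run it.
    """
    if duplex not in ("half", "full"):
        raise ValueError("duplex must be 'half' or 'full'")
    return ["ethtool", "-s", ifname, "duplex", duplex]


args = ethtool_duplex_args("eth0", "full")
print(" ".join(args))

# To apply the setting (needs root and a 3c509B on eth0), one could run:
#     import subprocess
#     subprocess.run(args, check=True)
```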
|
||||
|
||||
|
||||
(3) Available Transceiver Types
|
||||
|
||||
For versions of the driver v1.18c and above, the available transceiver types are:
|
||||
|
||||
  0   transceiver type from EEPROM config (normally 10baseT); force half-duplex
  1   AUI (thick-net / DB15 connector)
  2   (undefined)
  3   10base2 (thin-net == coax / BNC connector)
  4   10baseT (RJ-45 connector); force half-duplex mode
  8   transceiver type and duplex mode taken from card's EEPROM config settings
  12  10baseT (RJ-45 connector); force full-duplex mode
|
||||
|
||||
Prior to driver version 1.18c, only transceiver codes 0-4 were supported. Note
|
||||
that the new transceiver codes 8 and 12 are the *only* ones that will enable
|
||||
full-duplex mode, no matter what the card's detected EEPROM settings might be.
|
||||
This ensured that merely upgrading the driver from an earlier version would
|
||||
never automatically enable full-duplex mode in an existing installation;
|
||||
it must always be explicitly enabled via one of these codes in order to be
|
||||
activated.
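
The transceiver codes listed above can be captured in a small lookup table.
The Python sketch below is only an illustration of that table: the names
TRANSCEIVER_TYPES, FULL_DUPLEX_CODES and describe_transceiver are hypothetical,
and the code does not reflect how the driver stores these values internally.

```python
# Transceiver codes as listed in this file for driver v1.18c and above.
TRANSCEIVER_TYPES = {
    0: "type from EEPROM config (normally 10baseT); half-duplex forced",
    1: "AUI (thick-net / DB15 connector)",
    3: "10base2 (thin-net, coax / BNC connector)",
    4: "10baseT (RJ-45 connector); half-duplex forced",
    8: "type and duplex mode taken from the card's EEPROM config",
    12: "10baseT (RJ-45 connector); full-duplex forced",
}

# Per the text above, only these codes can ever enable full-duplex mode.
FULL_DUPLEX_CODES = {8, 12}


def describe_transceiver(code):
    """Return (description, may_enable_full_duplex) for a transceiver code."""
    if code not in TRANSCEIVER_TYPES:
        raise ValueError("undefined transceiver code: %d" % code)
    return TRANSCEIVER_TYPES[code], code in FULL_DUPLEX_CODES


for code in (0, 4, 8, 12):
    text, full_duplex = describe_transceiver(code)
    print(code, text, "| full-duplex possible:", full_duplex)
```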
|
||||
|
||||
The transceiver type can be changed using 'ethtool'.
|
||||
|
||||
|
||||
(4a) Interpretation of error messages and common problems
|
||||
|
||||
Error Messages
|
||||
|
||||
eth0: Infinite loop in interrupt, status 2011.
|
||||
These are "mostly harmless" messages indicating that the driver had too much
|
||||
work during that interrupt cycle. With a status of 0x2011 you are receiving
|
||||
packets faster than they can be removed from the card. This should be rare
|
||||
or impossible in normal operation. Possible causes of this error report are:
|
||||
|
||||
- a "green" mode enabled that slows the processor down when there is no
|
||||
keyboard activity.
|
||||
|
||||
- some other device or device driver hogging the bus or disabling interrupts.
|
||||
Check /proc/interrupts for excessive interrupt counts. The timer tick
|
||||
interrupt should always be incrementing faster than the others.
|
||||
|
||||
No received packets
|
||||
If a 3c509, 3c562 or 3c589 can successfully transmit packets, but never
|
||||
receives packets (as reported by /proc/net/dev or 'ifconfig') you likely
|
||||
have an interrupt line problem. Check /proc/interrupts to verify that the
|
||||
card is actually generating interrupts. If the interrupt count is not
|
||||
increasing you likely have a physical conflict with two devices trying to
|
||||
use the same ISA IRQ line. The common conflict is with a sound card on IRQ10
|
||||
or IRQ5, and the easiest solution is to move the 3c509 to a different
|
||||
interrupt line. If the device is receiving packets but 'ping' doesn't work,
|
||||
you have a routing problem.
|
||||
|
||||
Tx Carrier Errors Reported in /proc/net/dev
|
||||
If an EtherLink III appears to transmit packets, but the "Tx carrier errors"
|
||||
field in /proc/net/dev increments as quickly as the Tx packet count, you
|
||||
likely have an unterminated network or the incorrect media transceiver selected.
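
One way to spot this condition is to compare the Tx packet and Tx carrier
error counters in /proc/net/dev. The Python sketch below parses text in the
usual /proc/net/dev layout (eight receive counters followed by eight transmit
counters per interface). The sample data and the helper names are invented for
this example, and the check compares running totals rather than the change
between two readings, which is a simplification.

```python
# Hypothetical sample in the usual /proc/net/dev layout.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0:       0       0    0    0    0     0          0         0     6400     100  100    0    0     0     100          0
"""


def tx_carrier_stats(text):
    """Map interface name -> (tx_packets, tx_carrier_errors)."""
    stats = {}
    for line in text.splitlines()[2:]:      # skip the two header lines
        name, _, counters = line.partition(":")
        fields = counters.split()
        if len(fields) < 16:
            continue
        # Fields 0-7 are receive counters, 8-15 are transmit counters.
        stats[name.strip()] = (int(fields[9]), int(fields[14]))
    return stats


def carrier_problem_likely(tx_packets, carrier_errors):
    """True when carrier errors keep pace with transmitted packets."""
    return tx_packets > 0 and carrier_errors >= tx_packets


stats = tx_carrier_stats(SAMPLE)
for ifname, (packets, carrier) in stats.items():
    print(ifname, packets, carrier, carrier_problem_likely(packets, carrier))
```

On a live system the same function could be fed the contents of /proc/net/dev
instead of SAMPLE.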
|
||||
|
||||
3c509B card is not detected on machines with an ISA PnP BIOS.
|
||||
While the updated driver works with most PnP BIOS programs, it does not work
|
||||
with all. This can be fixed by disabling PnP support using the 3Com-supplied
|
||||
setup program.
|
||||
|
||||
3c509 card is not detected on overclocked machines
|
||||
Increase the delay time in id_read_eeprom() from the current value, 500,
|
||||
to an absurdly high value, such as 5000.
|
||||
|
||||
|
||||
(4b) Decoding Status and Error Messages
|
||||
|
||||
The bits in the main status register are:
|
||||
|
||||
  value   description
  0x01    Interrupt latch
  0x02    Tx overrun, or Rx underrun
  0x04    Tx complete
  0x08    Tx FIFO room available
  0x10    A complete Rx packet has arrived
  0x20    A Rx packet has started to arrive
  0x40    The driver has requested an interrupt
  0x80    Statistics counter nearly full
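
To make the bit table above easier to apply, the following Python sketch
decodes a main status value into the descriptions listed in this file. The
names STATUS_BITS and decode_status are hypothetical. Bits above 0x80 (such as
the 0x2000 part of the status 0x2011 quoted in section 4a) are not described in
this file, so the sketch simply ignores them.

```python
# Bit meanings copied from the main status register table in this file.
STATUS_BITS = [
    (0x01, "Interrupt latch"),
    (0x02, "Tx overrun, or Rx underrun"),
    (0x04, "Tx complete"),
    (0x08, "Tx FIFO room available"),
    (0x10, "A complete Rx packet has arrived"),
    (0x20, "A Rx packet has started to arrive"),
    (0x40, "The driver has requested an interrupt"),
    (0x80, "Statistics counter nearly full"),
]


def decode_status(status):
    """Return the descriptions of all documented bits set in 'status'."""
    return [text for bit, text in STATUS_BITS if status & bit]


for text in decode_status(0x2011):
    print(text)
```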
|
||||
|
||||
The bits in the transmit (Tx) status word are:
|
||||
|
||||
  value   description
  0x02    Out-of-window collision.
  0x04    Status stack overflow (normally impossible).
  0x08    16 collisions.
  0x10    Tx underrun (not enough PCI bus bandwidth).
  0x20    Tx jabber.
  0x40    Tx interrupt requested.
  0x80    Status is valid (this should always be set).
|
||||
|
||||
|
||||
When a transmit error occurs the driver produces a status message such as
|
||||
|
||||
eth0: Transmit error, Tx status register 82
|
||||
|
||||
The two values typically seen here are:
|
||||
|
||||
0x82
|
||||
Out of window collision. This typically occurs when some other Ethernet
|
||||
host is incorrectly set to full duplex on a half duplex network.
|
||||
|
||||
0x88
|
||||
16 collisions. This typically occurs when the network is exceptionally busy
|
||||
or when another host doesn't correctly back off after a collision. If this
|
||||
error is mixed with 0x82 errors it is the result of a host incorrectly set
|
||||
to full duplex (see above).
|
||||
|
||||
Both of these errors are the result of network problems that should be
|
||||
corrected. They do not represent driver malfunction.
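
The two typical values can be checked against the Tx status bit table with a
few lines of Python. This is only a sketch: TX_STATUS_BITS and
decode_tx_status are hypothetical names, and the input is assumed to be the
hexadecimal number exactly as it appears at the end of the "Transmit error"
message (for example "82").

```python
# Bit meanings copied from the transmit (Tx) status table in this file.
TX_STATUS_BITS = [
    (0x02, "Out-of-window collision"),
    (0x04, "Status stack overflow"),
    (0x08, "16 collisions"),
    (0x10, "Tx underrun"),
    (0x20, "Tx jabber"),
    (0x40, "Tx interrupt requested"),
    (0x80, "Status is valid"),
]


def decode_tx_status(hex_text):
    """Decode the hex value printed in a 'Transmit error' message."""
    value = int(hex_text, 16)
    return [text for bit, text in TX_STATUS_BITS if value & bit]


for message_value in ("82", "88"):
    print(message_value, "->", ", ".join(decode_tx_status(message_value)))
```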
|
||||
|
||||
|
||||
(5) Revision history (this file)
|
||||
|
||||
28Feb02 v1.0 DR New; major portions based on Becker original 3c509 docs
|
||||
|
||||
175
Documentation/networking/6pack.txt
Normal file
@@ -0,0 +1,175 @@
This is the 6pack-mini-HOWTO, written by
|
||||
|
||||
Andreas Könsgen DG3KQ
|
||||
Internet: ajk@comnets.uni-bremen.de
|
||||
AMPR-net: dg3kq@db0pra.ampr.org
|
||||
AX.25: dg3kq@db0ach.#nrw.deu.eu
|
||||
|
||||
Last update: April 7, 1998
|
||||
|
||||
1. What is 6pack, and what are its advantages over KISS?
|
||||
|
||||
6pack is a transmission protocol for data exchange between the PC and
|
||||
the TNC over a serial line. It can be used as an alternative to KISS.
|
||||
|
||||
6pack has two major advantages:
|
||||
- The PC is given full control over the radio
|
||||
channel. Special control data is exchanged between the PC and the TNC so
|
||||
that the PC knows at any time if the TNC is receiving data, if a TNC
|
||||
buffer underrun or overrun has occurred, if the PTT is
|
||||
set and so on. This control data is processed at a higher priority than
|
||||
normal data, so a data stream can be interrupted at any time to issue an
|
||||
important event. This helps to improve the channel access and timing
|
||||
algorithms as everything is computed in the PC. It would even be possible
|
||||
to experiment with something completely different from the known CSMA and
|
||||
DAMA channel access methods.
|
||||
This kind of real-time control is especially important to supply several
|
||||
TNCs that are connected between each other and the PC by a daisy chain
|
||||
(however, this feature is not supported yet by the Linux 6pack driver).
|
||||
|
||||
- Each packet transferred over the serial line is supplied with a checksum,
|
||||
so it is easy to detect errors due to problems on the serial line.
|
||||
Received packets that are corrupt are not passed on to the AX.25 layer.
|
||||
Damaged packets that the TNC has received from the PC are not transmitted.
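
The idea of dropping corrupt packets can be sketched in a few lines of Python.
Note that this file does not specify the 6pack checksum algorithm; the 8-bit
additive checksum below (all bytes of a frame, including the checksum byte, sum
to 0xFF modulo 256) is an assumption chosen for this illustration, and the
function names are hypothetical. The protocol description in 6pack.ps is the
authoritative reference.

```python
def add_checksum(payload):
    """Append one checksum byte so that all bytes sum to 0xFF modulo 256.

    Illustrative scheme only; not taken from the 6pack specification.
    """
    check = (0xFF - sum(payload)) & 0xFF
    return bytes(payload) + bytes([check])


def accept_frame(frame):
    """Return the payload if the checksum matches, otherwise None.

    Mirrors the behaviour described above: corrupt packets are not
    passed on to the next layer.
    """
    if sum(frame) & 0xFF != 0xFF:
        return None
    return bytes(frame[:-1])


frame = add_checksum(b"hello")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]   # flip one bit in transit

print(accept_frame(frame))
print(accept_frame(corrupted))
```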
|
||||
|
||||
More details about 6pack are described in the file 6pack.ps that is located
|
||||
in the doc directory of the AX.25 utilities package.
|
||||
|
||||
2. Who has developed the 6pack protocol?
|
||||
|
||||
The 6pack protocol has been developed by Ekki Plicht DF4OR, Henning Rech
|
||||
DF9IC and Gunter Jost DK7WJ. A driver for 6pack, written by Gunter Jost and
|
||||
Matthias Welwarsky DG2FEF, comes along with the PC version of FlexNet.
|
||||
They have also written a firmware for TNCs to perform the 6pack
|
||||
protocol (see section 4 below).
|
||||
|
||||
3. Where can I get the latest version of 6pack for Linux?
|
||||
|
||||
At the moment, the 6pack stuff can be obtained via anonymous ftp from
|
||||
db0bm.automation.fh-aachen.de. In the directory /incoming/dg3kq,
|
||||
there is a file named 6pack.tgz.
|
||||
|
||||
4. Preparing the TNC for 6pack operation
|
||||
|
||||
To be able to use 6pack, a special firmware for the TNC is needed. The EPROM
|
||||
of a newly bought TNC does not contain 6pack, so you will have to
|
||||
program an EPROM yourself. The image file for 6pack EPROMs should be
|
||||
available on any packet radio box where PC/FlexNet can be found. The name of
|
||||
the file is 6pack.bin. This file is copyrighted and maintained by the FlexNet
|
||||
team. It can be used under the terms of the license that comes along
|
||||
with PC/FlexNet. Please do not ask me about the internals of this file as I
|
||||
don't know anything about it. I used a textual description of the 6pack
|
||||
protocol to program the Linux driver.
|
||||
|
||||
TNCs contain a 64kByte EPROM, the lower half of which is used for
|
||||
the firmware/KISS. The upper half is either empty or is sometimes
|
||||
programmed with software called TAPR. In the latter case, the TNC
|
||||
is supplied with a DIP switch so you can easily change between the
|
||||
two systems. When programming a new EPROM, one of the systems is replaced
|
||||
by 6pack. It is useful to replace TAPR, as this software is rarely used
|
||||
nowadays. If your TNC is not equipped with the switch mentioned above, you
|
||||
can build in one yourself that switches over the highest address pin
|
||||
of the EPROM between HIGH and LOW level. After having inserted the new EPROM
|
||||
and switched to 6pack, apply power to the TNC for a first test. The connect
|
||||
and the status LED are lit for about a second if the firmware initialises
|
||||
the TNC correctly.
|
||||
|
||||
5. Building and installing the 6pack driver
|
||||
|
||||
The driver has been tested with kernel version 2.1.90. Use with older
|
||||
kernels may lead to a compilation error because the interface to a kernel
|
||||
function has been changed in the 2.1.8x kernels.
|
||||
|
||||
How to turn on 6pack support:
|
||||
|
||||
- In the linux kernel configuration program, select the code maturity level
|
||||
options menu and turn on the prompting for development drivers.
|
||||
|
||||
- Select the amateur radio support menu and turn on the serial port 6pack
|
||||
driver.
|
||||
|
||||
- Compile and install the kernel and the modules.
|
||||
|
||||
To use the driver, the kissattach program delivered with the AX.25 utilities
|
||||
has to be modified.
|
||||
|
||||
- Do a cd to the directory that holds the kissattach sources. Edit the
|
||||
kissattach.c file. At the top, insert the following lines:
|
||||
|
||||
#ifndef N_6PACK
|
||||
#define N_6PACK (N_AX25+1)
|
||||
#endif
|
||||
|
||||
Then find the line
|
||||
|
||||
int disc = N_AX25;
|
||||
|
||||
and replace N_AX25 by N_6PACK.
|
||||
|
||||
- Recompile kissattach. Rename it to spattach to avoid confusion.
|
||||
|
||||
Installing the driver:
|
||||
|
||||
- Do an insmod 6pack. Look at your /var/log/messages file to check if the
|
||||
module has printed its initialization message.
|
||||
|
||||
- Do a spattach as you would launch kissattach when starting a KISS port.
|
||||
Check if the kernel prints the message '6pack: TNC found'.
|
||||
|
||||
- From here, everything should work as if you were setting up a KISS port.
|
||||
The only difference is that the network device that represents
|
||||
the 6pack port is called sp instead of sl or ax. So, sp0 would be the
|
||||
first 6pack port.
|
||||
|
||||
Although the driver has been tested on various platforms, I still declare it
|
||||
ALPHA. BE CAREFUL! Sync your disks before insmoding the 6pack module
|
||||
and spattaching. Watch out if your computer behaves strangely. Read section
|
||||
6 of this file about known problems.
|
||||
|
||||
Note that the connect and status LEDs of the TNC are controlled in a
|
||||
different way than they are when the TNC is used with PC/FlexNet. When using
|
||||
FlexNet, the connect LED is on if there is a connection; the status LED is
|
||||
on if there is data in the buffer of the PC's AX.25 engine that has to be
|
||||
transmitted. Under Linux, the 6pack layer is below the AX.25 layer,
|
||||
so the 6pack driver doesn't know anything about connects or data that
|
||||
has not yet been transmitted. Therefore the LEDs are controlled
|
||||
as they are in KISS mode: The connect LED is turned on if data is transferred
|
||||
from the PC to the TNC over the serial line, the status LED if data is
|
||||
sent to the PC.
|
||||
|
||||
6. Known problems
|
||||
|
||||
When testing the driver with 2.0.3x kernels and
|
||||
operating with data rates on the radio channel of 9600 Baud or higher,
|
||||
the driver may, on certain systems, sometimes print the message '6pack:
|
||||
bad checksum', which is due to data loss if the other station sends two
|
||||
or more subsequent packets. I have been told that this is due to a problem
|
||||
with the serial driver of 2.0.3x kernels. I don't know yet if the problem
|
||||
still exists with 2.1.x kernels, as I have heard that the serial driver
|
||||
code has been changed with 2.1.x.
|
||||
|
||||
When shutting down the sp interface with ifconfig, the kernel crashes if
|
||||
there is still an AX.25 connection left over which an IP connection was
|
||||
running, even if that IP connection is already closed. The problem does not
|
||||
occur when there is a bare AX.25 connection still running. I don't know if
|
||||
this is a problem of the 6pack driver or something else in the kernel.
|
||||
|
||||
The driver has been tested as a module, not yet as a kernel-builtin driver.
|
||||
|
||||
The 6pack protocol supports daisy-chaining of TNCs in a token ring, which is
|
||||
connected to one serial port of the PC. This feature is not implemented
|
||||
and at least at the moment I won't be able to do it because I do not have
|
||||
the opportunity to build a TNC daisy-chain and test it.
|
||||
|
||||
Some of the comments in the source code are inaccurate. They are left from
|
||||
the SLIP/KISS driver, from which the 6pack driver has been derived.
|
||||
I haven't modified or removed them yet -- sorry! The code itself needs
|
||||
some cleaning and optimizing. This will be done in a later release.
|
||||
|
||||
If you encounter a bug or if you have a question or suggestion concerning the
|
||||
driver, feel free to mail me, using the addresses given at the beginning of
|
||||
this file.
|
||||
|
||||
Have fun!
|
||||
|
||||
Andreas
|
||||
46
Documentation/networking/LICENSE.qla3xxx
Normal file
@@ -0,0 +1,46 @@
Copyright (c) 2003-2006 QLogic Corporation
|
||||
QLogic Linux Networking HBA Driver
|
||||
|
||||
This program includes a device driver for Linux 2.6 that may be
|
||||
distributed with QLogic hardware specific firmware binary file.
|
||||
You may modify and redistribute the device driver code under the
|
||||
GNU General Public License as published by the Free Software
|
||||
Foundation (version 2 or a later version).
|
||||
|
||||
You may redistribute the hardware specific firmware binary file
|
||||
under the following terms:
|
||||
|
||||
1. Redistribution of source code (only if applicable),
|
||||
must retain the above copyright notice, this list of
|
||||
conditions and the following disclaimer.
|
||||
|
||||
2. Redistribution in binary form must reproduce the above
|
||||
copyright notice, this list of conditions and the
|
||||
following disclaimer in the documentation and/or other
|
||||
materials provided with the distribution.
|
||||
|
||||
3. The name of QLogic Corporation may not be used to
|
||||
endorse or promote products derived from this software
|
||||
without specific prior written permission
|
||||
|
||||
REGARDLESS OF WHAT LICENSING MECHANISM IS USED OR APPLICABLE,
|
||||
THIS PROGRAM IS PROVIDED BY QLOGIC CORPORATION "AS IS'' AND ANY
|
||||
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
|
||||
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR
|
||||
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
|
||||
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
|
||||
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
|
||||
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
|
||||
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
||||
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
|
||||
POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
USER ACKNOWLEDGES AND AGREES THAT USE OF THIS PROGRAM WILL NOT
|
||||
CREATE OR GIVE GROUNDS FOR A LICENSE BY IMPLICATION, ESTOPPEL, OR
|
||||
OTHERWISE IN ANY INTELLECTUAL PROPERTY RIGHTS (PATENT, COPYRIGHT,
|
||||
TRADE SECRET, MASK WORK, OR OTHER PROPRIETARY RIGHT) EMBODIED IN
|
||||
ANY OTHER QLOGIC HARDWARE OR SOFTWARE EITHER SOLELY OR IN
|
||||
COMBINATION WITH THIS PROGRAM.
|
||||
|
||||
288
Documentation/networking/LICENSE.qlcnic
Normal file
@@ -0,0 +1,288 @@
Copyright (c) 2009-2013 QLogic Corporation
|
||||
QLogic Linux qlcnic NIC Driver
|
||||
|
||||
You may modify and redistribute the device driver code under the
|
||||
GNU General Public License (a copy of which is attached hereto as
|
||||
Exhibit A) published by the Free Software Foundation (version 2).
|
||||
|
||||
|
||||
EXHIBIT A
|
||||
|
||||
GNU GENERAL PUBLIC LICENSE
|
||||
Version 2, June 1991
|
||||
|
||||
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
|
||||
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
|
||||
Everyone is permitted to copy and distribute verbatim copies
|
||||
of this license document, but changing it is not allowed.
|
||||
|
||||
Preamble
|
||||
|
||||
The licenses for most software are designed to take away your
|
||||
freedom to share and change it. By contrast, the GNU General Public
|
||||
License is intended to guarantee your freedom to share and change free
|
||||
software--to make sure the software is free for all its users. This
|
||||
General Public License applies to most of the Free Software
|
||||
Foundation's software and to any other program whose authors commit to
|
||||
using it. (Some other Free Software Foundation software is covered by
|
||||
the GNU Lesser General Public License instead.) You can apply it to
|
||||
your programs, too.
|
||||
|
||||
When we speak of free software, we are referring to freedom, not
|
||||
price. Our General Public Licenses are designed to make sure that you
|
||||
have the freedom to distribute copies of free software (and charge for
|
||||
this service if you wish), that you receive source code or can get it
|
||||
if you want it, that you can change the software or use pieces of it
|
||||
in new free programs; and that you know you can do these things.
|
||||
|
||||
To protect your rights, we need to make restrictions that forbid
|
||||
anyone to deny you these rights or to ask you to surrender the rights.
|
||||
These restrictions translate to certain responsibilities for you if you
|
||||
distribute copies of the software, or if you modify it.
|
||||
|
||||
For example, if you distribute copies of such a program, whether
|
||||
gratis or for a fee, you must give the recipients all the rights that
|
||||
you have. You must make sure that they, too, receive or can get the
|
||||
source code. And you must show them these terms so they know their
|
||||
rights.
|
||||
|
||||
We protect your rights with two steps: (1) copyright the software, and
|
||||
(2) offer you this license which gives you legal permission to copy,
|
||||
distribute and/or modify the software.
|
||||
|
||||
Also, for each author's protection and ours, we want to make certain
|
||||
that everyone understands that there is no warranty for this free
|
||||
software. If the software is modified by someone else and passed on, we
|
||||
want its recipients to know that what they have is not the original, so
|
||||
that any problems introduced by others will not reflect on the original
|
||||
authors' reputations.
|
||||
|
||||
Finally, any free program is threatened constantly by software
|
||||
patents. We wish to avoid the danger that redistributors of a free
|
||||
program will individually obtain patent licenses, in effect making the
|
||||
program proprietary. To prevent this, we have made it clear that any
|
||||
patent must be licensed for everyone's free use or not licensed at all.
|
||||
|
||||
The precise terms and conditions for copying, distribution and
|
||||
modification follow.
|
||||
|
||||
GNU GENERAL PUBLIC LICENSE
|
||||
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
|
||||
|
||||
0. This License applies to any program or other work which contains
|
||||
a notice placed by the copyright holder saying it may be distributed
|
||||
under the terms of this General Public License. The "Program", below,
|
||||
refers to any such program or work, and a "work based on the Program"
|
||||
means either the Program or any derivative work under copyright law:
|
||||
that is to say, a work containing the Program or a portion of it,
|
||||
either verbatim or with modifications and/or translated into another
|
||||
language. (Hereinafter, translation is included without limitation in
|
||||
the term "modification".) Each licensee is addressed as "you".
|
||||
|
||||
Activities other than copying, distribution and modification are not
|
||||
covered by this License; they are outside its scope. The act of
|
||||
running the Program is not restricted, and the output from the Program
|
||||
is covered only if its contents constitute a work based on the
|
||||
Program (independent of having been made by running the Program).
|
||||
Whether that is true depends on what the Program does.
|
||||
|
||||
1. You may copy and distribute verbatim copies of the Program's
|
||||
source code as you receive it, in any medium, provided that you
|
||||
conspicuously and appropriately publish on each copy an appropriate
|
||||
copyright notice and disclaimer of warranty; keep intact all the
|
||||
notices that refer to this License and to the absence of any warranty;
|
||||
and give any other recipients of the Program a copy of this License
|
||||
along with the Program.
|
||||
|
||||
You may charge a fee for the physical act of transferring a copy, and
|
||||
you may at your option offer warranty protection in exchange for a fee.
|
||||
|
||||
2. You may modify your copy or copies of the Program or any portion
|
||||
of it, thus forming a work based on the Program, and copy and
|
||||
distribute such modifications or work under the terms of Section 1
|
||||
above, provided that you also meet all of these conditions:
|
||||
|
||||
a) You must cause the modified files to carry prominent notices
|
||||
stating that you changed the files and the date of any change.
|
||||
|
||||
b) You must cause any work that you distribute or publish, that in
|
||||
whole or in part contains or is derived from the Program or any
|
||||
part thereof, to be licensed as a whole at no charge to all third
|
||||
parties under the terms of this License.
|
||||
|
||||
c) If the modified program normally reads commands interactively
|
||||
when run, you must cause it, when started running for such
|
||||
interactive use in the most ordinary way, to print or display an
|
||||
announcement including an appropriate copyright notice and a
|
||||
notice that there is no warranty (or else, saying that you provide
|
||||
a warranty) and that users may redistribute the program under
|
||||
these conditions, and telling the user how to view a copy of this
|
||||
License. (Exception: if the Program itself is interactive but
|
||||
does not normally print such an announcement, your work based on
|
||||
the Program is not required to print an announcement.)
|
||||
|
||||
These requirements apply to the modified work as a whole. If
|
||||
identifiable sections of that work are not derived from the Program,
|
||||
and can be reasonably considered independent and separate works in
|
||||
themselves, then this License, and its terms, do not apply to those
|
||||
sections when you distribute them as separate works. But when you
|
||||
distribute the same sections as part of a whole which is a work based
|
||||
on the Program, the distribution of the whole must be on the terms of
|
||||
this License, whose permissions for other licensees extend to the
|
||||
entire whole, and thus to each and every part regardless of who wrote it.
|
||||
|
||||
Thus, it is not the intent of this section to claim rights or contest
|
||||
your rights to work written entirely by you; rather, the intent is to
|
||||
exercise the right to control the distribution of derivative or
|
||||
collective works based on the Program.
|
||||
|
||||
In addition, mere aggregation of another work not based on the Program
|
||||
with the Program (or with a work based on the Program) on a volume of
|
||||
a storage or distribution medium does not bring the other work under
|
||||
the scope of this License.
|
||||
|
||||
3. You may copy and distribute the Program (or a work based on it,
|
||||
under Section 2) in object code or executable form under the terms of
|
||||
Sections 1 and 2 above provided that you also do one of the following:
|
||||
|
||||
a) Accompany it with the complete corresponding machine-readable
|
||||
source code, which must be distributed under the terms of Sections
|
||||
1 and 2 above on a medium customarily used for software interchange; or,
|
||||
|
||||
b) Accompany it with a written offer, valid for at least three
|
||||
years, to give any third party, for a charge no more than your
|
||||
cost of physically performing source distribution, a complete
|
||||
machine-readable copy of the corresponding source code, to be
|
||||
distributed under the terms of Sections 1 and 2 above on a medium
|
||||
customarily used for software interchange; or,
|
||||
|
||||
c) Accompany it with the information you received as to the offer
|
||||
to distribute corresponding source code. (This alternative is
|
||||
allowed only for noncommercial distribution and only if you
|
||||
received the program in object code or executable form with such
|
||||
an offer, in accord with Subsection b above.)
|
||||
|
||||
The source code for a work means the preferred form of the work for
|
||||
making modifications to it. For an executable work, complete source
|
||||
code means all the source code for all modules it contains, plus any
|
||||
associated interface definition files, plus the scripts used to
|
||||
control compilation and installation of the executable. However, as a
|
||||
special exception, the source code distributed need not include
|
||||
anything that is normally distributed (in either source or binary
|
||||
form) with the major components (compiler, kernel, and so on) of the
|
||||
operating system on which the executable runs, unless that component
|
||||
itself accompanies the executable.
|
||||
|
||||
If distribution of executable or object code is made by offering
|
||||
access to copy from a designated place, then offering equivalent
|
||||
access to copy the source code from the same place counts as
|
||||
distribution of the source code, even though third parties are not
|
||||
compelled to copy the source along with the object code.
|
||||
|
||||
4. You may not copy, modify, sublicense, or distribute the Program
|
||||
except as expressly provided under this License. Any attempt
|
||||
otherwise to copy, modify, sublicense or distribute the Program is
|
||||
void, and will automatically terminate your rights under this License.
|
||||
However, parties who have received copies, or rights, from you under
|
||||
this License will not have their licenses terminated so long as such
|
||||
parties remain in full compliance.
|
||||
|
||||
5. You are not required to accept this License, since you have not
|
||||
signed it. However, nothing else grants you permission to modify or
|
||||
distribute the Program or its derivative works. These actions are
|
||||
prohibited by law if you do not accept this License. Therefore, by
|
||||
modifying or distributing the Program (or any work based on the
|
||||
Program), you indicate your acceptance of this License to do so, and
|
||||
all its terms and conditions for copying, distributing or modifying
|
||||
the Program or works based on it.
|
||||
|
||||
6. Each time you redistribute the Program (or any work based on the
|
||||
Program), the recipient automatically receives a license from the
|
||||
original licensor to copy, distribute or modify the Program subject to
|
||||
these terms and conditions. You may not impose any further
|
||||
restrictions on the recipients' exercise of the rights granted herein.
|
||||
You are not responsible for enforcing compliance by third parties to
|
||||
this License.
|
||||
|
||||
7. If, as a consequence of a court judgment or allegation of patent
|
||||
infringement or for any other reason (not limited to patent issues),
|
||||
conditions are imposed on you (whether by court order, agreement or
|
||||
otherwise) that contradict the conditions of this License, they do not
|
||||
excuse you from the conditions of this License. If you cannot
|
||||
distribute so as to satisfy simultaneously your obligations under this
|
||||
License and any other pertinent obligations, then as a consequence you
|
||||
may not distribute the Program at all. For example, if a patent
|
||||
license would not permit royalty-free redistribution of the Program by
|
||||
all those who receive copies directly or indirectly through you, then
|
||||
the only way you could satisfy both it and this License would be to
|
||||
refrain entirely from distribution of the Program.
|
||||
|
||||
If any portion of this section is held invalid or unenforceable under
|
||||
any particular circumstance, the balance of the section is intended to
|
||||
apply and the section as a whole is intended to apply in other
|
||||
circumstances.
|
||||
|
||||
It is not the purpose of this section to induce you to infringe any
|
||||
patents or other property right claims or to contest validity of any
|
||||
such claims; this section has the sole purpose of protecting the
|
||||
integrity of the free software distribution system, which is
|
||||
implemented by public license practices. Many people have made
|
||||
generous contributions to the wide range of software distributed
|
||||
through that system in reliance on consistent application of that
|
||||
system; it is up to the author/donor to decide if he or she is willing
|
||||
to distribute software through any other system and a licensee cannot
|
||||
impose that choice.
|
||||
|
||||
This section is intended to make thoroughly clear what is believed to
|
||||
be a consequence of the rest of this License.
|
||||
|
||||
8. If the distribution and/or use of the Program is restricted in
|
||||
certain countries either by patents or by copyrighted interfaces, the
|
||||
original copyright holder who places the Program under this License
|
||||
may add an explicit geographical distribution limitation excluding
|
||||
those countries, so that distribution is permitted only in or among
|
||||
countries not thus excluded. In such case, this License incorporates
|
||||
the limitation as if written in the body of this License.
|
||||
|
||||
9. The Free Software Foundation may publish revised and/or new versions
|
||||
of the General Public License from time to time. Such new versions will
|
||||
be similar in spirit to the present version, but may differ in detail to
|
||||
address new problems or concerns.
|
||||
|
||||
Each version is given a distinguishing version number. If the Program
|
||||
specifies a version number of this License which applies to it and "any
|
||||
later version", you have the option of following the terms and conditions
|
||||
either of that version or of any later version published by the Free
|
||||
Software Foundation. If the Program does not specify a version number of
|
||||
this License, you may choose any version ever published by the Free Software
|
||||
Foundation.
|
||||
|
||||
10. If you wish to incorporate parts of the Program into other free
|
||||
programs whose distribution conditions are different, write to the author
|
||||
to ask for permission. For software which is copyrighted by the Free
|
||||
Software Foundation, write to the Free Software Foundation; we sometimes
|
||||
make exceptions for this. Our decision will be guided by the two goals
|
||||
of preserving the free status of all derivatives of our free software and
|
||||
of promoting the sharing and reuse of software generally.
|
||||
|
||||
NO WARRANTY
|
||||
|
||||
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
|
||||
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
|
||||
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
|
||||
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
|
||||
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
|
||||
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
|
||||
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
|
||||
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
|
||||
REPAIR OR CORRECTION.
|
||||
|
||||
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
|
||||
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
|
||||
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
|
||||
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
|
||||
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
|
||||
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
|
||||
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
|
||||
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
|
||||
POSSIBILITY OF SUCH DAMAGES.
|
||||
288
Documentation/networking/LICENSE.qlge
Normal file
|
|
@ -0,0 +1,288 @@
|
|||
Copyright (c) 2003-2011 QLogic Corporation
|
||||
QLogic Linux qlge NIC Driver
|
||||
|
||||
You may modify and redistribute the device driver code under the
|
||||
GNU General Public License (a copy of which is attached hereto as
|
||||
Exhibit A) published by the Free Software Foundation (version 2).
|
||||
|
||||
|
||||
EXHIBIT A
|
||||
|
||||
GNU GENERAL PUBLIC LICENSE
|
||||
Version 2, June 1991
|
||||
|
||||
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
|
||||
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
|
||||
Everyone is permitted to copy and distribute verbatim copies
|
||||
of this license document, but changing it is not allowed.
|
||||
|
||||
Preamble
|
||||
|
||||
The licenses for most software are designed to take away your
|
||||
freedom to share and change it. By contrast, the GNU General Public
|
||||
License is intended to guarantee your freedom to share and change free
|
||||
software--to make sure the software is free for all its users. This
|
||||
General Public License applies to most of the Free Software
|
||||
Foundation's software and to any other program whose authors commit to
|
||||
using it. (Some other Free Software Foundation software is covered by
|
||||
the GNU Lesser General Public License instead.) You can apply it to
|
||||
your programs, too.
|
||||
|
||||
When we speak of free software, we are referring to freedom, not
|
||||
price. Our General Public Licenses are designed to make sure that you
|
||||
have the freedom to distribute copies of free software (and charge for
|
||||
this service if you wish), that you receive source code or can get it
|
||||
if you want it, that you can change the software or use pieces of it
|
||||
in new free programs; and that you know you can do these things.
|
||||
|
||||
To protect your rights, we need to make restrictions that forbid
|
||||
anyone to deny you these rights or to ask you to surrender the rights.
|
||||
These restrictions translate to certain responsibilities for you if you
|
||||
distribute copies of the software, or if you modify it.
|
||||
|
||||
For example, if you distribute copies of such a program, whether
|
||||
gratis or for a fee, you must give the recipients all the rights that
|
||||
you have. You must make sure that they, too, receive or can get the
|
||||
source code. And you must show them these terms so they know their
|
||||
rights.
|
||||
|
||||
We protect your rights with two steps: (1) copyright the software, and
|
||||
(2) offer you this license which gives you legal permission to copy,
|
||||
distribute and/or modify the software.
|
||||
|
||||
Also, for each author's protection and ours, we want to make certain
|
||||
that everyone understands that there is no warranty for this free
|
||||
software. If the software is modified by someone else and passed on, we
|
||||
want its recipients to know that what they have is not the original, so
|
||||
that any problems introduced by others will not reflect on the original
|
||||
authors' reputations.
|
||||
|
||||
Finally, any free program is threatened constantly by software
|
||||
patents. We wish to avoid the danger that redistributors of a free
|
||||
program will individually obtain patent licenses, in effect making the
|
||||
program proprietary. To prevent this, we have made it clear that any
|
||||
patent must be licensed for everyone's free use or not licensed at all.
|
||||
|
||||
The precise terms and conditions for copying, distribution and
|
||||
modification follow.
|
||||
|
||||
GNU GENERAL PUBLIC LICENSE
|
||||
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
|
||||
|
||||
0. This License applies to any program or other work which contains
|
||||
a notice placed by the copyright holder saying it may be distributed
|
||||
under the terms of this General Public License. The "Program", below,
|
||||
refers to any such program or work, and a "work based on the Program"
|
||||
means either the Program or any derivative work under copyright law:
|
||||
that is to say, a work containing the Program or a portion of it,
|
||||
either verbatim or with modifications and/or translated into another
|
||||
language. (Hereinafter, translation is included without limitation in
|
||||
the term "modification".) Each licensee is addressed as "you".
|
||||
|
||||
Activities other than copying, distribution and modification are not
|
||||
covered by this License; they are outside its scope. The act of
|
||||
running the Program is not restricted, and the output from the Program
|
||||
is covered only if its contents constitute a work based on the
|
||||
Program (independent of having been made by running the Program).
|
||||
Whether that is true depends on what the Program does.
|
||||
|
||||
1. You may copy and distribute verbatim copies of the Program's
|
||||
source code as you receive it, in any medium, provided that you
|
||||
conspicuously and appropriately publish on each copy an appropriate
|
||||
copyright notice and disclaimer of warranty; keep intact all the
|
||||
notices that refer to this License and to the absence of any warranty;
|
||||
and give any other recipients of the Program a copy of this License
|
||||
along with the Program.
|
||||
|
||||
You may charge a fee for the physical act of transferring a copy, and
|
||||
you may at your option offer warranty protection in exchange for a fee.
|
||||
|
||||
2. You may modify your copy or copies of the Program or any portion
|
||||
of it, thus forming a work based on the Program, and copy and
|
||||
distribute such modifications or work under the terms of Section 1
|
||||
above, provided that you also meet all of these conditions:
|
||||
|
||||
a) You must cause the modified files to carry prominent notices
|
||||
stating that you changed the files and the date of any change.
|
||||
|
||||
b) You must cause any work that you distribute or publish, that in
|
||||
whole or in part contains or is derived from the Program or any
|
||||
part thereof, to be licensed as a whole at no charge to all third
|
||||
parties under the terms of this License.
|
||||
|
||||
c) If the modified program normally reads commands interactively
|
||||
when run, you must cause it, when started running for such
|
||||
interactive use in the most ordinary way, to print or display an
|
||||
announcement including an appropriate copyright notice and a
|
||||
notice that there is no warranty (or else, saying that you provide
|
||||
a warranty) and that users may redistribute the program under
|
||||
these conditions, and telling the user how to view a copy of this
|
||||
License. (Exception: if the Program itself is interactive but
|
||||
does not normally print such an announcement, your work based on
|
||||
the Program is not required to print an announcement.)
|
||||
|
||||
These requirements apply to the modified work as a whole. If
|
||||
identifiable sections of that work are not derived from the Program,
|
||||
and can be reasonably considered independent and separate works in
|
||||
themselves, then this License, and its terms, do not apply to those
|
||||
sections when you distribute them as separate works. But when you
|
||||
distribute the same sections as part of a whole which is a work based
|
||||
on the Program, the distribution of the whole must be on the terms of
|
||||
this License, whose permissions for other licensees extend to the
|
||||
entire whole, and thus to each and every part regardless of who wrote it.
|
||||
|
||||
Thus, it is not the intent of this section to claim rights or contest
|
||||
your rights to work written entirely by you; rather, the intent is to
|
||||
exercise the right to control the distribution of derivative or
|
||||
collective works based on the Program.
|
||||
|
||||
In addition, mere aggregation of another work not based on the Program
|
||||
with the Program (or with a work based on the Program) on a volume of
|
||||
a storage or distribution medium does not bring the other work under
|
||||
the scope of this License.
|
||||
|
||||
3. You may copy and distribute the Program (or a work based on it,
|
||||
under Section 2) in object code or executable form under the terms of
|
||||
Sections 1 and 2 above provided that you also do one of the following:
|
||||
|
||||
a) Accompany it with the complete corresponding machine-readable
|
||||
source code, which must be distributed under the terms of Sections
|
||||
1 and 2 above on a medium customarily used for software interchange; or,
|
||||
|
||||
b) Accompany it with a written offer, valid for at least three
|
||||
years, to give any third party, for a charge no more than your
|
||||
cost of physically performing source distribution, a complete
|
||||
machine-readable copy of the corresponding source code, to be
|
||||
distributed under the terms of Sections 1 and 2 above on a medium
|
||||
customarily used for software interchange; or,
|
||||
|
||||
c) Accompany it with the information you received as to the offer
|
||||
to distribute corresponding source code. (This alternative is
|
||||
allowed only for noncommercial distribution and only if you
|
||||
received the program in object code or executable form with such
|
||||
an offer, in accord with Subsection b above.)
|
||||
|
||||
The source code for a work means the preferred form of the work for
|
||||
making modifications to it. For an executable work, complete source
|
||||
code means all the source code for all modules it contains, plus any
|
||||
associated interface definition files, plus the scripts used to
|
||||
control compilation and installation of the executable. However, as a
|
||||
special exception, the source code distributed need not include
|
||||
anything that is normally distributed (in either source or binary
|
||||
form) with the major components (compiler, kernel, and so on) of the
|
||||
operating system on which the executable runs, unless that component
|
||||
itself accompanies the executable.
|
||||
|
||||
If distribution of executable or object code is made by offering
|
||||
access to copy from a designated place, then offering equivalent
|
||||
access to copy the source code from the same place counts as
|
||||
distribution of the source code, even though third parties are not
|
||||
compelled to copy the source along with the object code.
|
||||
|
||||
4. You may not copy, modify, sublicense, or distribute the Program
|
||||
except as expressly provided under this License. Any attempt
|
||||
otherwise to copy, modify, sublicense or distribute the Program is
|
||||
void, and will automatically terminate your rights under this License.
|
||||
However, parties who have received copies, or rights, from you under
|
||||
this License will not have their licenses terminated so long as such
|
||||
parties remain in full compliance.
|
||||
|
||||
5. You are not required to accept this License, since you have not
|
||||
signed it. However, nothing else grants you permission to modify or
|
||||
distribute the Program or its derivative works. These actions are
|
||||
prohibited by law if you do not accept this License. Therefore, by
|
||||
modifying or distributing the Program (or any work based on the
|
||||
Program), you indicate your acceptance of this License to do so, and
|
||||
all its terms and conditions for copying, distributing or modifying
|
||||
the Program or works based on it.
|
||||
|
||||
6. Each time you redistribute the Program (or any work based on the
|
||||
Program), the recipient automatically receives a license from the
|
||||
original licensor to copy, distribute or modify the Program subject to
|
||||
these terms and conditions. You may not impose any further
|
||||
restrictions on the recipients' exercise of the rights granted herein.
|
||||
You are not responsible for enforcing compliance by third parties to
|
||||
this License.
|
||||
|
||||
7. If, as a consequence of a court judgment or allegation of patent
|
||||
infringement or for any other reason (not limited to patent issues),
|
||||
conditions are imposed on you (whether by court order, agreement or
|
||||
otherwise) that contradict the conditions of this License, they do not
|
||||
excuse you from the conditions of this License. If you cannot
|
||||
distribute so as to satisfy simultaneously your obligations under this
|
||||
License and any other pertinent obligations, then as a consequence you
|
||||
may not distribute the Program at all. For example, if a patent
|
||||
license would not permit royalty-free redistribution of the Program by
|
||||
all those who receive copies directly or indirectly through you, then
|
||||
the only way you could satisfy both it and this License would be to
|
||||
refrain entirely from distribution of the Program.
|
||||
|
||||
If any portion of this section is held invalid or unenforceable under
|
||||
any particular circumstance, the balance of the section is intended to
|
||||
apply and the section as a whole is intended to apply in other
|
||||
circumstances.
|
||||
|
||||
It is not the purpose of this section to induce you to infringe any
|
||||
patents or other property right claims or to contest validity of any
|
||||
such claims; this section has the sole purpose of protecting the
|
||||
integrity of the free software distribution system, which is
|
||||
implemented by public license practices. Many people have made
|
||||
generous contributions to the wide range of software distributed
|
||||
through that system in reliance on consistent application of that
|
||||
system; it is up to the author/donor to decide if he or she is willing
|
||||
to distribute software through any other system and a licensee cannot
|
||||
impose that choice.
|
||||
|
||||
This section is intended to make thoroughly clear what is believed to
|
||||
be a consequence of the rest of this License.
|
||||
|
||||
8. If the distribution and/or use of the Program is restricted in
|
||||
certain countries either by patents or by copyrighted interfaces, the
|
||||
original copyright holder who places the Program under this License
|
||||
may add an explicit geographical distribution limitation excluding
|
||||
those countries, so that distribution is permitted only in or among
|
||||
countries not thus excluded. In such case, this License incorporates
|
||||
the limitation as if written in the body of this License.
|
||||
|
||||
9. The Free Software Foundation may publish revised and/or new versions
|
||||
of the General Public License from time to time. Such new versions will
|
||||
be similar in spirit to the present version, but may differ in detail to
|
||||
address new problems or concerns.
|
||||
|
||||
Each version is given a distinguishing version number. If the Program
|
||||
specifies a version number of this License which applies to it and "any
|
||||
later version", you have the option of following the terms and conditions
|
||||
either of that version or of any later version published by the Free
|
||||
Software Foundation. If the Program does not specify a version number of
|
||||
this License, you may choose any version ever published by the Free Software
|
||||
Foundation.
|
||||
|
||||
10. If you wish to incorporate parts of the Program into other free
|
||||
programs whose distribution conditions are different, write to the author
|
||||
to ask for permission. For software which is copyrighted by the Free
|
||||
Software Foundation, write to the Free Software Foundation; we sometimes
|
||||
make exceptions for this. Our decision will be guided by the two goals
|
||||
of preserving the free status of all derivatives of our free software and
|
||||
of promoting the sharing and reuse of software generally.
|
||||
|
||||
NO WARRANTY
|
||||
|
||||
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
|
||||
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
|
||||
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
|
||||
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
|
||||
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
|
||||
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
|
||||
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
|
||||
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
|
||||
REPAIR OR CORRECTION.
|
||||
|
||||
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
|
||||
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
|
||||
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
|
||||
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
|
||||
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
|
||||
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
|
||||
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
|
||||
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
|
||||
POSSIBILITY OF SUCH DAMAGES.
|
||||
1
Documentation/networking/Makefile
Normal file
|
|
@@ -0,0 +1 @@
|
|||
subdir-y := timestamping
|
||||
215
Documentation/networking/PLIP.txt
Normal file
|
|
@@ -0,0 +1,215 @@
|
|||
PLIP: The Parallel Line Internet Protocol Device
|
||||
|
||||
Donald Becker (becker@super.org)
|
||||
I.D.A. Supercomputing Research Center, Bowie MD 20715
|
||||
|
||||
At some point T. Thorn will probably contribute text,
|
||||
Tommy Thorn (tthorn@daimi.aau.dk)
|
||||
|
||||
PLIP Introduction
|
||||
-----------------
|
||||
|
||||
This document describes the parallel port packet pusher for Net/LGX.
|
||||
This device interface allows a point-to-point connection between two
|
||||
parallel ports to appear as an IP network interface.
|
||||
|
||||
What is PLIP?
|
||||
=============
|
||||
|
||||
PLIP is Parallel Line IP, that is, the transportation of IP packets
over a parallel port. In the case of a PC, the obvious choice is the
printer port. PLIP is non-standard, but it uses the standard
LapLink null-printer cable [it can also work in turbo mode, with a PLIP
cable]. [The protocol used to pack IP packets is a simple one
initiated by Crynwr.]
|
||||
|
||||
Advantages of PLIP
|
||||
==================
|
||||
|
||||
It's cheap, it's available everywhere, and it's easy.
|
||||
|
||||
The PLIP cable is all that's needed to connect two Linux boxes, and it
|
||||
can be built for very few bucks.
|
||||
|
||||
Connecting two Linux boxes takes only a second's decision and a few
|
||||
minutes' work, no need to search for a [supported] netcard. This might
|
||||
even be especially important in the case of notebooks, where netcards
|
||||
are not easily available.
|
||||
|
||||
Not requiring a netcard also means that apart from connecting the
|
||||
cables, everything else is software configuration [which in principle
|
||||
could be made very easy.]
|
||||
|
||||
Disadvantages of PLIP
|
||||
=====================
|
||||
|
||||
Doesn't work over a modem, like SLIP and PPP. Limited range, 15 m.
|
||||
Can only be used to connect three (?) Linux boxes. Doesn't connect to
|
||||
an existing Ethernet. Isn't standard (not even de facto standard, like
|
||||
SLIP).
|
||||
|
||||
Performance
|
||||
===========
|
||||
|
||||
PLIP easily outperforms Ethernet cards....(ups, I was dreaming, but
|
||||
it *is* getting late. EOB)
|
||||
|
||||
PLIP driver details
|
||||
-------------------
|
||||
|
||||
The Linux PLIP driver is an implementation of the original Crynwr protocol,
|
||||
that uses the parallel port subsystem of the kernel in order to properly
|
||||
share parallel ports between PLIP and other services.
|
||||
|
||||
IRQs and trigger timeouts
|
||||
=========================
|
||||
|
||||
When a parallel port used for a PLIP driver has an IRQ configured to it, the
|
||||
PLIP driver is signaled whenever data is sent to it via the cable, such that
|
||||
when no data is available, the driver isn't being used.
|
||||
|
||||
However, on some machines it is hard, if not impossible, to configure an IRQ
|
||||
to a certain parallel port, mainly because it is used by some other device.
|
||||
On these machines, the PLIP driver can be used in IRQ-less mode, where
|
||||
the PLIP driver would constantly poll the parallel port for data waiting,
|
||||
and if such data is available, process it. This mode is less efficient than
|
||||
the IRQ mode, because the driver has to check the parallel port many times
|
||||
per second, even when no data at all is sent. Some rough measurements
|
||||
indicate that there isn't a noticeable performance drop when using IRQ-less
|
||||
mode as compared to IRQ mode as far as the data transfer speed is concerned.
There is, however, a performance drop on the machine hosting the driver.
|
||||
|
||||
When the PLIP driver is used in IRQ mode, the timeout used for triggering a
|
||||
data transfer (the maximal time the PLIP driver would allow the other side
|
||||
before announcing a timeout, when trying to handshake a transfer of some
|
||||
data) is, by default, 500usec. As IRQ delivery is more or less immediate,
|
||||
this timeout is quite sufficient.
|
||||
|
||||
When in IRQ-less mode, the PLIP driver polls the parallel port HZ times
|
||||
per second (where HZ is typically 100 on most platforms, and 1024 on an
|
||||
Alpha, as of this writing). Between two such polls, there are 10^6/HZ usecs.
|
||||
On an i386, for example, 10^6/100 = 10000usec. It is easy to see that it is
|
||||
quite possible for the trigger timeout to expire between two such polls, as
|
||||
the timeout is only 500usec long. As a result, it is required to change the
|
||||
trigger timeout on the *other* side of a PLIP connection, to about
|
||||
10^6/HZ usecs. If both sides of a PLIP connection are used in IRQ-less mode,
|
||||
this timeout is required on both sides.
|
||||
|
||||
It appears that in practice, the trigger timeout can be shorter than in the
|
||||
above calculation. It isn't an important issue, unless the wire is faulty,
|
||||
in which case a long timeout would stall the machine when, for whatever
|
||||
reason, bits are dropped.
|
||||
|
||||
A utility that can perform this change in Linux is plipconfig, which is part
|
||||
of the net-tools package (its location can be found in the
|
||||
Documentation/Changes file). An example command would be
|
||||
'plipconfig plipX trigger 10000', where plipX is the appropriate
|
||||
PLIP device.
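
As a rough illustration of the arithmetic above, the following Python sketch
computes the interval between two polls for a given HZ and builds the matching
plipconfig command line. The function names and the device name plip0 are
hypothetical; only the 'plipconfig plipX trigger N' form is taken from this
document.

```python
def poll_interval_usec(hz):
    """Microseconds between two polls when the port is polled HZ times per second."""
    if hz <= 0:
        raise ValueError("HZ must be positive")
    return 10**6 // hz


def plipconfig_trigger_command(device, hz):
    """Build the argument list for 'plipconfig <device> trigger <usec>'."""
    return ["plipconfig", device, "trigger", str(poll_interval_usec(hz))]


if __name__ == "__main__":
    # HZ = 100 (the i386 example above) gives a 10000 usec trigger timeout.
    print(" ".join(plipconfig_trigger_command("plip0", 100)))
```

The value is meant for the *other* side of the link, as explained above, and
in practice a shorter timeout may be enough.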
|
||||
|
||||
PLIP hardware interconnection
|
||||
-----------------------------
|
||||
|
||||
PLIP uses several different data transfer methods. The first (and the
|
||||
only one implemented in the early version of the code) uses a standard
|
||||
printer "null" cable to transfer data four bits at a time using
|
||||
data bit outputs connected to status bit inputs.
|
||||
|
||||
The second data transfer method relies on both machines having
|
||||
bi-directional parallel ports, rather than output-only ``printer''
|
||||
ports. This allows byte-wide transfers and avoids reconstructing
|
||||
nibbles into bytes, leading to much faster transfers.
|
||||
|
||||
Parallel Transfer Mode 0 Cable
|
||||
==============================
|
||||
|
||||
The cable for the first transfer mode is a standard
|
||||
printer "null" cable which transfers data four bits at a time using
|
||||
data bit outputs of the first port (machine T) connected to the
|
||||
status bit inputs of the second port (machine R). There are five
|
||||
status inputs, and they are used as four data inputs and a clock (data
|
||||
strobe) input, arranged so that the data input bits appear as contiguous
|
||||
bits with standard status register implementation.
|
||||
|
||||
A cable that implements this protocol is available commercially as a
|
||||
"Null Printer" or "Turbo Laplink" cable. It can be constructed with
|
||||
two DB-25 male connectors symmetrically connected as follows:
|
||||
|
||||
    STROBE output    1*
    D0->ERROR        2 - 15         15 - 2
    D1->SLCT         3 - 13         13 - 3
    D2->PAPOUT       4 - 12         12 - 4
    D3->ACK          5 - 10         10 - 5
    D4->BUSY         6 - 11         11 - 6
    D5,D6,D7 are     7*, 8*, 9*
    AUTOFD output    14*
    INIT   output    16*
    SLCTIN           17 - 17
    extra grounds are 18*,19*,20*,21*,22*,23*,24*
    GROUND           25 - 25
* Do not connect these pins on either end
|
||||
|
||||
If the cable you are using has a metallic shield it should be
|
||||
connected to the metallic DB-25 shell at one end only.
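
The Mode 0 wiring above can be written down as data and checked for
consistency before soldering. This is only a sketch: the names MODE0_WIRING,
MODE0_UNCONNECTED, is_symmetric and check_cable are made up for the example,
and the pin numbers are copied from the table.

```python
# Pin pairs for the Mode 0 cable, copied from the table above. Each key is a
# pin on one connector, the value is the pin it is wired to on the other.
MODE0_WIRING = {
    2: 15, 3: 13, 4: 12, 5: 10, 6: 11,   # D0..D4 -> ERROR, SLCT, PAPOUT, ACK, BUSY
    15: 2, 13: 3, 12: 4, 10: 5, 11: 6,   # the same wires seen from the other end
    17: 17,                              # SLCTIN
    25: 25,                              # GROUND
}

# Pins marked '*' in the table: leave unconnected on both ends.
MODE0_UNCONNECTED = {1, 7, 8, 9, 14, 16, 18, 19, 20, 21, 22, 23, 24}


def is_symmetric(wiring):
    """True if every wire a-b is also listed as b-a."""
    return all(wiring.get(b) == a for a, b in wiring.items())


def check_cable(wiring, unconnected):
    """Return a list of problems; an empty list means the description is self-consistent."""
    problems = []
    if not is_symmetric(wiring):
        problems.append("wiring is not symmetric")
    overlap = set(wiring) & unconnected
    if overlap:
        problems.append("pins both wired and unconnected: %s" % sorted(overlap))
    missing = set(range(1, 26)) - set(wiring) - unconnected
    if missing:
        problems.append("pins not accounted for: %s" % sorted(missing))
    return problems


if __name__ == "__main__":
    print(check_cable(MODE0_WIRING, MODE0_UNCONNECTED))
```

The symmetry check reflects the "symmetrically connected" wording above:
either end of the cable can be plugged into either machine.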
|
||||
|
||||
Parallel Transfer Mode 1
|
||||
========================
|
||||
|
||||
The second data transfer method relies on both machines having
|
||||
bi-directional parallel ports, rather than output-only ``printer''
|
||||
ports. This allows byte-wide transfers, and avoids reconstructing
|
||||
nibbles into bytes. This cable should not be used on unidirectional
|
||||
``printer'' (as opposed to ``parallel'') ports or when the machine
|
||||
isn't configured for PLIP, as it will result in output driver
|
||||
conflicts and the (unlikely) possibility of damage.
|
||||
|
||||
The cable for this transfer mode should be constructed as follows:
|
||||
|
||||
    STROBE->BUSY     1 - 11
    D0->D0           2 - 2
    D1->D1           3 - 3
    D2->D2           4 - 4
    D3->D3           5 - 5
    D4->D4           6 - 6
    D5->D5           7 - 7
    D6->D6           8 - 8
    D7->D7           9 - 9
    INIT -> ACK      16 - 10
    AUTOFD->PAPOUT   14 - 12
    SLCT->SLCTIN     13 - 17
    GND->ERROR       18 - 15
    extra grounds are 19*,20*,21*,22*,23*,24*
    GROUND           25 - 25
* Do not connect these pins on either end
|
||||
|
||||
Once again, if the cable you are using has a metallic shield it should
|
||||
be connected to the metallic DB-25 shell at one end only.
|
||||
|
||||
PLIP Mode 0 transfer protocol
|
||||
=============================
|
||||
|
||||
The PLIP driver is compatible with the "Crynwr" parallel port transfer
|
||||
standard in Mode 0. That standard specifies the following protocol:
|
||||
|
||||
    send header nibble '0x8'
    count-low octet
    count-high octet
    ... data octets
    checksum octet

Each octet is sent as
    <wait for rx. '0x1?'>   <send 0x10+(octet&0x0F)>
    <wait for rx. '0x0?'>   <send 0x00+((octet>>4)&0x0F)>
|
||||
|
||||
To start a transfer the transmitting machine outputs a nibble 0x08.
|
||||
That raises the ACK line, triggering an interrupt in the receiving
|
||||
machine. The receiving machine disables interrupts and raises its own ACK
|
||||
line.
|
||||
|
||||
Restated:
|
||||
|
||||
(OUT is bit 0-4, OUT.j is bit j from OUT. IN likewise)
Send_Byte:
   OUT := low nibble, OUT.4 := 1
   WAIT FOR IN.4 = 1
   OUT := high nibble, OUT.4 := 0
   WAIT FOR IN.4 = 0
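
The nibble encoding can be sketched in Python as follows. This models only
the values placed on the data lines, not the handshake or its timing; the
function names are hypothetical. The checksum is passed in by the caller
because this document does not say how it is computed.

```python
def encode_octet(octet):
    """Split one octet into the two 5-bit values written to the data lines.

    The low nibble goes first with bit 4 set (0x10 + low), then the high
    nibble with bit 4 clear (0x00 + high), as in the listing above.
    """
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet out of range")
    return (0x10 + (octet & 0x0F), 0x00 + ((octet >> 4) & 0x0F))


def decode_octet(first, second):
    """Inverse of encode_octet: rebuild the octet from the two received values."""
    return (first & 0x0F) | ((second & 0x0F) << 4)


def frame_octets(payload, checksum):
    """Octets that follow the 0x8 header nibble: count-low, count-high, data, checksum."""
    count = len(payload)
    if count > 0xFFFF:
        raise ValueError("payload too long for a 16-bit count")
    return [count & 0xFF, (count >> 8) & 0xFF] + list(payload) + [checksum & 0xFF]


if __name__ == "__main__":
    for octet in frame_octets(b"\x01\x02", 0x03):
        print([hex(v) for v in encode_octet(octet)])
```

Bit 4 acts as the clock: it toggles with every nibble, which is what the
"WAIT FOR IN.4" steps in Send_Byte synchronize on.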
|
||||
293
Documentation/networking/README.ipw2100
Normal file
|
|
@@ -0,0 +1,293 @@
|
|||
|
||||
Intel(R) PRO/Wireless 2100 Driver for Linux in support of:
|
||||
|
||||
Intel(R) PRO/Wireless 2100 Network Connection
|
||||
|
||||
Copyright (C) 2003-2006, Intel Corporation
|
||||
|
||||
README.ipw2100
|
||||
|
||||
Version: git-1.1.5
|
||||
Date : January 25, 2006
|
||||
|
||||
Index
|
||||
-----------------------------------------------
|
||||
0. IMPORTANT INFORMATION BEFORE USING THIS DRIVER
|
||||
1. Introduction
|
||||
2. Release git-1.1.5 Current Features
|
||||
3. Command Line Parameters
|
||||
4. Sysfs Helper Files
|
||||
5. Radio Kill Switch
|
||||
6. Dynamic Firmware
|
||||
7. Power Management
|
||||
8. Support
|
||||
9. License
|
||||
|
||||
|
||||
0. IMPORTANT INFORMATION BEFORE USING THIS DRIVER
|
||||
-----------------------------------------------
|
||||
|
||||
Important Notice FOR ALL USERS OR DISTRIBUTORS!!!!
|
||||
|
||||
Intel wireless LAN adapters are engineered, manufactured, tested, and
|
||||
quality checked to ensure that they meet all necessary local and
|
||||
governmental regulatory agency requirements for the regions that they
|
||||
are designated and/or marked to ship into. Since wireless LANs are
|
||||
generally unlicensed devices that share spectrum with radars,
|
||||
satellites, and other licensed and unlicensed devices, it is sometimes
|
||||
necessary to dynamically detect, avoid, and limit usage to avoid
|
||||
interference with these devices. In many instances Intel is required to
|
||||
provide test data to prove regional and local compliance to regional and
|
||||
governmental regulations before certification or approval to use the
|
||||
product is granted. Intel's wireless LAN's EEPROM, firmware, and
|
||||
software driver are designed to carefully control parameters that affect
|
||||
radio operation and to ensure electromagnetic compliance (EMC). These
|
||||
parameters include, without limitation, RF power, spectrum usage,
|
||||
channel scanning, and human exposure.
|
||||
|
||||
For these reasons Intel cannot permit any manipulation by third parties
|
||||
of the software provided in binary format with the wireless WLAN
|
||||
adapters (e.g., the EEPROM and firmware). Furthermore, if you use any
|
||||
patches, utilities, or code with the Intel wireless LAN adapters that
|
||||
have been manipulated by an unauthorized party (i.e., patches,
|
||||
utilities, or code (including open source code modifications) which have
|
||||
not been validated by Intel), (i) you will be solely responsible for
|
||||
ensuring the regulatory compliance of the products, (ii) Intel will bear
|
||||
no liability, under any theory of liability for any issues associated
|
||||
with the modified products, including without limitation, claims under
|
||||
the warranty and/or issues arising from regulatory non-compliance, and
|
||||
(iii) Intel will not provide or be required to assist in providing
|
||||
support to any third parties for such modified products.
|
||||
|
||||
Note: Many regulatory agencies consider Wireless LAN adapters to be
|
||||
modules, and accordingly, condition system-level regulatory approval
|
||||
upon receipt and review of test data documenting that the antennas and
|
||||
system configuration do not cause the EMC and radio operation to be
|
||||
non-compliant.
|
||||
|
||||
The drivers available for download from SourceForge are provided as a
|
||||
part of a development project. Conformance to local regulatory
|
||||
requirements is the responsibility of the individual developer. As
|
||||
such, if you are interested in deploying or shipping a driver as part of
|
||||
solution intended to be used for purposes other than development, please
|
||||
obtain a tested driver from Intel Customer Support at:
|
||||
|
||||
http://www.intel.com/support/wireless/sb/CS-006408.htm
|
||||
|
||||
1. Introduction
|
||||
-----------------------------------------------
|
||||
|
||||
This document provides a brief overview of the features supported by the
|
||||
IPW2100 driver project. The main project website, where the latest
|
||||
development version of the driver can be found, is:
|
||||
|
||||
http://ipw2100.sourceforge.net
|
||||
|
||||
There you can find not only the latest releases, but also information about
|
||||
potential fixes and patches, as well as links to the development mailing list
|
||||
for the driver project.
|
||||
|
||||
|
||||
2. Release git-1.1.5 Current Supported Features
|
||||
-----------------------------------------------
|
||||
- Managed (BSS) and Ad-Hoc (IBSS)
|
||||
- WEP (shared key and open)
|
||||
- Wireless Tools support
|
||||
- 802.1x (tested with XSupplicant 1.0.1)
|
||||
|
||||
Enabled (but not supported) features:
|
||||
- Monitor/RFMon mode
|
||||
- WPA/WPA2
|
||||
|
||||
The distinction between officially supported and enabled is a reflection
|
||||
on the amount of validation and interoperability testing that has been
|
||||
performed on a given feature.
|
||||
|
||||
|
||||
3. Command Line Parameters
|
||||
-----------------------------------------------
|
||||
|
||||
If the driver is built as a module, the following optional parameters are used
|
||||
by entering them on the command line with the modprobe command using this
|
||||
syntax:
|
||||
|
||||
modprobe ipw2100 [<option>=<VAL1><,VAL2>...]
|
||||
|
||||
For example, to disable the radio on driver loading, enter:
|
||||
|
||||
modprobe ipw2100 disable=1
|
||||
|
||||
The ipw2100 driver supports the following module parameters:
|
||||
|
||||
Name            Value           Example:
debug           0x0-0xffffffff  debug=1024
mode            0,1,2           mode=1   /* AdHoc */
channel         int             channel=3 /* Only valid in AdHoc or Monitor */
associate       boolean         associate=0 /* Do NOT auto associate */
disable         boolean         disable=1 /* Do not power the HW */
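
For scripted setups, the command line can be assembled from a dictionary of
parameters. The sketch below is illustrative only: the helper name
modprobe_command is made up, and it checks parameter names against the table
above without validating the values.

```python
# Parameter names from the table above. Values are passed through unchecked.
IPW2100_PARAMS = ("debug", "mode", "channel", "associate", "disable")


def modprobe_command(params):
    """Build the argument list for 'modprobe ipw2100 name=value ...'."""
    args = ["modprobe", "ipw2100"]
    for name, value in params.items():
        if name not in IPW2100_PARAMS:
            raise ValueError("unknown ipw2100 parameter: %s" % name)
        args.append("%s=%s" % (name, value))
    return args


if __name__ == "__main__":
    # Ad-Hoc mode on channel 3, using the example values from the table.
    print(" ".join(modprobe_command({"mode": 1, "channel": 3})))
```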
|
||||
|
||||
|
||||
4. Sysfs Helper Files
|
||||
-----------------------------------------------
|
||||
|
||||
There are several ways to control the behavior of the driver. Many of the
|
||||
general capabilities are exposed through the Wireless Tools (iwconfig). There
|
||||
are a few capabilities that are exposed through entries in the Linux Sysfs.
|
||||
|
||||
|
||||
----- Driver Level ------
|
||||
For the driver level files, look in /sys/bus/pci/drivers/ipw2100/
|
||||
|
||||
debug_level
|
||||
|
||||
This controls the same global as the 'debug' module parameter. For
|
||||
information on the various debugging levels available, run the 'dvals'
|
||||
script found in the driver source directory.
|
||||
|
||||
NOTE: 'debug_level' is only enabled if CONFIG_IPW2100_DEBUG is turned
on.
|
||||
|
||||
----- Device Level ------
|
||||
For the device level files look in
|
||||
|
||||
/sys/bus/pci/drivers/ipw2100/{PCI-ID}/
|
||||
|
||||
For example:
|
||||
/sys/bus/pci/drivers/ipw2100/0000:02:01.0
|
||||
|
||||
The following device level file is available there:
|
||||
|
||||
  rf_kill
        read -
        0 = RF kill not enabled (radio on)
        1 = SW based RF kill active (radio off)
        2 = HW based RF kill active (radio off)
        3 = Both HW and SW RF kill active (radio off)
        write -
        0 = If SW based RF kill active, turn the radio back on
        1 = If radio is on, activate SW based RF kill

        NOTE: If you enable the SW based RF kill and then toggle the HW
        based RF kill from ON -> OFF -> ON, the radio will NOT come back on.
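
The four read values look like a two-bit mask (bit 0 for the SW kill, bit 1
for the HW kill). That reading is an inference from the list above, not
something this document states. Under that assumption, a value read from
rf_kill could be decoded as in this sketch, where decode_rf_kill is a
hypothetical helper:

```python
def decode_rf_kill(value):
    """Interpret a value read from rf_kill (0-3) as listed above."""
    if value not in (0, 1, 2, 3):
        raise ValueError("unexpected rf_kill value: %r" % (value,))
    return {
        "sw_kill": bool(value & 1),   # 1 and 3: SW based RF kill active
        "hw_kill": bool(value & 2),   # 2 and 3: HW based RF kill active
        "radio_on": value == 0,       # radio is on only when neither is active
    }


if __name__ == "__main__":
    for v in range(4):
        print(v, decode_rf_kill(v))
```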
|
||||
|
||||
|
||||
5. Radio Kill Switch
|
||||
-----------------------------------------------
|
||||
Most laptops provide the ability for the user to physically disable the radio.
|
||||
Some vendors have implemented this as a physical switch that requires no
|
||||
software to turn the radio off and on. On other laptops, however, the switch
|
||||
is controlled through a button being pressed and a software driver then making
|
||||
calls to turn the radio off and on. This is referred to as a "software based
|
||||
RF kill switch".
|
||||
|
||||
See the Sysfs helper file 'rf_kill' for determining the state of the RF switch
|
||||
on your system.
|
||||
|
||||
|
||||
6. Dynamic Firmware
|
||||
-----------------------------------------------
|
||||
As the firmware is licensed under a restricted use license, it cannot be
|
||||
included within the kernel sources. To enable the IPW2100 you will need a
|
||||
firmware image to load into the wireless NIC's processors.
|
||||
|
||||
You can obtain these images from <http://ipw2100.sf.net/firmware.php>.
|
||||
|
||||
See INSTALL for instructions on installing the firmware.
|
||||
|
||||
|
||||
7. Power Management
|
||||
-----------------------------------------------
|
||||
The IPW2100 supports the configuration of the Power Save Protocol
|
||||
through a private wireless extension interface. The IPW2100 supports
|
||||
the following different modes:
|
||||
|
||||
    off No power management. Radio is always on.
    on  Automatic power management
    1-5 Different levels of power management. The higher the
        number the greater the power savings, but with an impact to
        packet latencies.
|
||||
|
||||
Power management works by powering down the radio after a certain
|
||||
interval of time has passed where no packets are passed through the
|
||||
radio. Once powered down, the radio remains in that state for a given
|
||||
period of time. For higher power savings, the interval between last
|
||||
packet processed to sleep is shorter and the sleep period is longer.
|
||||
|
||||
When the radio is asleep, the access point sending data to the station
|
||||
must buffer packets at the AP until the station wakes up and requests
|
||||
any buffered packets. If you have an AP that does not correctly support
|
||||
the PSP protocol you may experience packet loss or very poor performance
|
||||
while power management is enabled. If this is the case, you will need
|
||||
to try and find a firmware update for your AP, or disable power
|
||||
management (via `iwconfig eth1 power off`).
|
||||
|
||||
To configure the power level on the IPW2100 you use a combination of
|
||||
iwconfig and iwpriv. iwconfig is used to turn power management on, off,
|
||||
and set it to auto.
|
||||
|
||||
iwconfig eth1 power off    Disables radio power down
iwconfig eth1 power on     Enables radio power management to
                           last set level (defaults to AUTO)
iwpriv eth1 set_power 0    Sets power level to AUTO and enables
                           power management if not previously
                           enabled.
iwpriv eth1 set_power 1-5  Set the power level as specified,
                           enabling power management if not
                           previously enabled.
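
The split between iwconfig and iwpriv can be captured in a small helper. This
is a sketch only: power_command is a hypothetical name, and eth1 is simply
the interface name used in the examples above.

```python
def power_command(setting, ifname="eth1"):
    """Map 'off', 'on' or a level 0-5 to the matching command from the table above."""
    if setting in ("off", "on"):
        return ["iwconfig", ifname, "power", setting]
    if isinstance(setting, int) and not isinstance(setting, bool) and 0 <= setting <= 5:
        # 0 selects AUTO; 1-5 select a fixed power level.
        return ["iwpriv", ifname, "set_power", str(setting)]
    raise ValueError("setting must be 'off', 'on' or an integer 0-5")


if __name__ == "__main__":
    for s in ("off", "on", 0, 3):
        print(" ".join(power_command(s)))
```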
|
||||
|
||||
You can view the current power level setting via:
|
||||
|
||||
iwpriv eth1 get_power
|
||||
|
||||
It will return the current period or timeout that is configured as a string
|
||||
in the form of xxxx/yyyy (z) where xxxx is the timeout interval (amount of
|
||||
time after packet processing), yyyy is the period to sleep (amount of time to
|
||||
wait before powering the radio and querying the access point for buffered
|
||||
packets), and z is the 'power level'. If power management is turned off the
|
||||
xxxx/yyyy will be replaced with 'off' -- the level reported will be the active
|
||||
level if `iwconfig eth1 power on` is invoked.
|
||||
|
||||
|
||||
8. Support
|
||||
-----------------------------------------------
|
||||
|
||||
For general development information and support,
|
||||
go to:
|
||||
|
||||
http://ipw2100.sf.net/
|
||||
|
||||
The ipw2100 1.1.0 driver and firmware can be downloaded from:
|
||||
|
||||
http://support.intel.com
|
||||
|
||||
For installation support on the ipw2100 1.1.0 driver on Linux kernels
|
||||
2.6.8 or greater, email support is available from:
|
||||
|
||||
http://supportmail.intel.com
|
||||
|
||||
9. License
|
||||
-----------------------------------------------
|
||||
|
||||
Copyright(c) 2003 - 2006 Intel Corporation. All rights reserved.
|
||||
|
||||
This program is free software; you can redistribute it and/or modify it
|
||||
under the terms of the GNU General Public License (version 2) as
|
||||
published by the Free Software Foundation.
|
||||
|
||||
This program is distributed in the hope that it will be useful, but WITHOUT
|
||||
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
|
||||
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
|
||||
more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License along with
|
||||
this program; if not, write to the Free Software Foundation, Inc., 59
|
||||
Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||||
|
||||
The full GNU General Public License is included in this distribution in the
|
||||
file called LICENSE.
|
||||
|
||||
License Contact Information:
|
||||
James P. Ketrenos <ipw2100-admin@linux.intel.com>
|
||||
Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
|
||||
|
||||
472
Documentation/networking/README.ipw2200
Normal file
|
|
@@ -0,0 +1,472 @@
|
|||
|
||||
Intel(R) PRO/Wireless 2915ABG Driver for Linux in support of:
|
||||
|
||||
Intel(R) PRO/Wireless 2200BG Network Connection
|
||||
Intel(R) PRO/Wireless 2915ABG Network Connection
|
||||
|
||||
Note: The Intel(R) PRO/Wireless 2915ABG Driver for Linux and Intel(R)
|
||||
PRO/Wireless 2200BG Driver for Linux is a unified driver that works on
|
||||
both hardware adapters listed above. In this document the Intel(R)
|
||||
PRO/Wireless 2915ABG Driver for Linux will be used to reference the
|
||||
unified driver.
|
||||
|
||||
Copyright (C) 2004-2006, Intel Corporation
|
||||
|
||||
README.ipw2200
|
||||
|
||||
Version: 1.1.2
|
||||
Date : March 30, 2006
|
||||
|
||||
|
||||
Index
|
||||
-----------------------------------------------
|
||||
0. IMPORTANT INFORMATION BEFORE USING THIS DRIVER
|
||||
1. Introduction
|
||||
1.1. Overview of features
|
||||
1.2. Module parameters
|
||||
1.3. Wireless Extension Private Methods
|
||||
1.4. Sysfs Helper Files
|
||||
1.5. Supported channels
|
||||
2. Ad-Hoc Networking
|
||||
3. Interacting with Wireless Tools
|
||||
3.1. iwconfig mode
|
||||
3.2. iwconfig sens
|
||||
4. About the Version Numbers
|
||||
5. Firmware installation
|
||||
6. Support
|
||||
7. License
|
||||
|
||||
|
||||
0. IMPORTANT INFORMATION BEFORE USING THIS DRIVER
|
||||
-----------------------------------------------
|
||||
|
||||
Important Notice FOR ALL USERS OR DISTRIBUTORS!!!!
|
||||
|
||||
Intel wireless LAN adapters are engineered, manufactured, tested, and
|
||||
quality checked to ensure that they meet all necessary local and
|
||||
governmental regulatory agency requirements for the regions that they
|
||||
are designated and/or marked to ship into. Since wireless LANs are
|
||||
generally unlicensed devices that share spectrum with radars,
|
||||
satellites, and other licensed and unlicensed devices, it is sometimes
|
||||
necessary to dynamically detect, avoid, and limit usage to avoid
|
||||
interference with these devices. In many instances Intel is required to
|
||||
provide test data to prove regional and local compliance to regional and
|
||||
governmental regulations before certification or approval to use the
|
||||
product is granted. Intel's wireless LAN's EEPROM, firmware, and
|
||||
software driver are designed to carefully control parameters that affect
|
||||
radio operation and to ensure electromagnetic compliance (EMC). These
|
||||
parameters include, without limitation, RF power, spectrum usage,
|
||||
channel scanning, and human exposure.
|
||||
|
||||
For these reasons Intel cannot permit any manipulation by third parties
|
||||
of the software provided in binary format with the wireless WLAN
|
||||
adapters (e.g., the EEPROM and firmware). Furthermore, if you use any
|
||||
patches, utilities, or code with the Intel wireless LAN adapters that
|
||||
have been manipulated by an unauthorized party (i.e., patches,
|
||||
utilities, or code (including open source code modifications) which have
|
||||
not been validated by Intel), (i) you will be solely responsible for
|
||||
ensuring the regulatory compliance of the products, (ii) Intel will bear
|
||||
no liability, under any theory of liability for any issues associated
|
||||
with the modified products, including without limitation, claims under
|
||||
the warranty and/or issues arising from regulatory non-compliance, and
|
||||
(iii) Intel will not provide or be required to assist in providing
|
||||
support to any third parties for such modified products.
|
||||
|
||||
Note: Many regulatory agencies consider Wireless LAN adapters to be
|
||||
modules, and accordingly, condition system-level regulatory approval
|
||||
upon receipt and review of test data documenting that the antennas and
|
||||
system configuration do not cause the EMC and radio operation to be
|
||||
non-compliant.
|
||||
|
||||
The drivers available for download from SourceForge are provided as a
|
||||
part of a development project. Conformance to local regulatory
|
||||
requirements is the responsibility of the individual developer. As
|
||||
such, if you are interested in deploying or shipping a driver as part of
|
||||
solution intended to be used for purposes other than development, please
|
||||
obtain a tested driver from Intel Customer Support at:
|
||||
|
||||
http://support.intel.com
|
||||
|
||||
|
||||
1. Introduction
|
||||
-----------------------------------------------
|
||||
The following sections attempt to provide a brief introduction to using
|
||||
the Intel(R) PRO/Wireless 2915ABG Driver for Linux.
|
||||
|
||||
This document is not meant to be a comprehensive manual on
|
||||
understanding or using wireless technologies, but should be sufficient
|
||||
to get you moving without wires on Linux.
|
||||
|
||||
For information on building and installing the driver, see the INSTALL
|
||||
file.
|
||||
|
||||
|
||||
1.1. Overview of Features
|
||||
-----------------------------------------------
|
||||
The current release (1.1.2) supports the following features:
|
||||
|
||||
+ BSS mode (Infrastructure, Managed)
|
||||
+ IBSS mode (Ad-Hoc)
|
||||
+ WEP (OPEN and SHARED KEY mode)
|
||||
+ 802.1x EAP via wpa_supplicant and xsupplicant
|
||||
+ Wireless Extension support
|
||||
+ Full B and G rate support (2200 and 2915)
|
||||
+ Full A rate support (2915 only)
|
||||
+ Transmit power control
|
||||
+ S state support (ACPI suspend/resume)
|
||||
|
||||
The following features are currently enabled, but not officially
|
||||
supported:
|
||||
|
||||
+ WPA
|
||||
+ long/short preamble support
|
||||
+ Monitor mode (aka RFMon)
|
||||
|
||||
The distinction between officially supported and enabled is a reflection
|
||||
on the amount of validation and interoperability testing that has been
|
||||
performed on a given feature.
|
||||
|
||||
|
||||
|
||||
1.2. Command Line Parameters
|
||||
-----------------------------------------------
|
||||
|
||||
Like many modules used in the Linux kernel, the Intel(R) PRO/Wireless
|
||||
2915ABG Driver for Linux allows configuration options to be provided
|
||||
as module parameters. The most common way to specify a module parameter
|
||||
is via the command line.
|
||||
|
||||
The general form is:
|
||||
|
||||
% modprobe ipw2200 parameter=value
|
||||
|
||||
Where the supported parameters are:
|
||||
|
||||
associate
|
||||
Set to 0 to disable the auto scan-and-associate functionality of the
|
||||
driver. If disabled, the driver will not attempt to scan
|
||||
for and associate to a network until it has been configured with
|
||||
one or more properties for the target network, for example configuring
|
||||
the network SSID. Default is 0 (do not auto-associate)
|
||||
|
||||
Example: % modprobe ipw2200 associate=0
|
||||
|
||||
auto_create
|
||||
Set to 0 to disable the auto creation of an Ad-Hoc network
|
||||
matching the channel and network name parameters provided.
|
||||
Default is 1.
|
||||
|
||||
channel
|
||||
channel number for association. The normal method for setting
|
||||
the channel would be to use the standard wireless tools
|
||||
(i.e. `iwconfig eth1 channel 10`), but it is useful sometimes
|
||||
to set this while debugging. Channel 0 means 'ANY'
|
||||
|
||||
debug
|
||||
If using a debug build, this is used to control how much debug
info is logged. See the 'dvals' and 'load' scripts for more info on
|
||||
how to use this (the dvals and load scripts are provided as part
|
||||
of the ipw2200 development snapshot releases available from the
|
||||
SourceForge project at http://ipw2200.sf.net)
|
||||
|
||||
led
|
||||
Can be used to turn on experimental LED code.
|
||||
0 = Off, 1 = On. Default is 1.
|
||||
|
||||
mode
|
||||
Can be used to set the default mode of the adapter.
|
||||
0 = Managed, 1 = Ad-Hoc, 2 = Monitor
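
To keep module parameters across reloads, they can be written as an
"options" line for modprobe's configuration directory. The following is a
minimal sketch, not part of the driver: the function name
ipw2200_options_line is hypothetical, and its checks only mirror the
parameter names and value ranges listed above.

```sh
# Build a modprobe.d "options" line for ipw2200 from parameter=value pairs.
# Hypothetical helper; it only accepts the parameters documented above.
ipw2200_options_line() {
    line="options ipw2200"
    for arg in "$@"; do
        case $arg in
            # Parameters with a small, documented set of values.
            associate=[01] | auto_create=[01] | led=[01] | mode=[012]) ;;
            # Reject an empty or non-numeric channel.
            channel=*[!0-9]* | channel=)
                echo "channel must be a number: $arg" >&2
                return 1 ;;
            # Any numeric channel, and any non-empty debug value.
            channel=* | debug=?*) ;;
            *)
                echo "unsupported parameter: $arg" >&2
                return 1 ;;
        esac
        line="$line $arg"
    done
    echo "$line"
}
```

As root, the output could be redirected into a file such as
/etc/modprobe.d/ipw2200.conf (the file name is an assumption), for example:
ipw2200_options_line associate=0 mode=1 channel=10
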
|
||||
|
||||
|
||||
1.3. Wireless Extension Private Methods
|
||||
-----------------------------------------------
|
||||
|
||||
As an interface designed to handle generic hardware, there are certain
|
||||
capabilities not exposed through the normal Wireless Tool interface. As
|
||||
such, a provision is provided for a driver to declare custom, or
|
||||
private, methods. The Intel(R) PRO/Wireless 2915ABG Driver for Linux
|
||||
defines several of these to configure various settings.
|
||||
|
||||
The general form of using the private wireless methods is:
|
||||
|
||||
% iwpriv $IFNAME method parameters
|
||||
|
||||
Where $IFNAME is the interface name the device is registered with
|
||||
(typically eth1, customized via one of the various network interface
|
||||
name managers, such as ifrename)
|
||||
|
||||
The supported private methods are:
|
||||
|
||||
get_mode
|
||||
Can be used to report out which IEEE mode the driver is
|
||||
configured to support. Example:
|
||||
|
||||
% iwpriv eth1 get_mode
|
||||
eth1 get_mode:802.11bg (6)
|
||||
|
||||
set_mode
|
||||
Can be used to configure which IEEE mode the driver will
|
||||
support.
|
||||
|
||||
Usage:
|
||||
% iwpriv eth1 set_mode {mode}
|
||||
Where {mode} is a number in the range 1-7:
|
||||
1 802.11a (2915 only)
|
||||
2 802.11b
|
||||
3 802.11ab (2915 only)
|
||||
4 802.11g
|
||||
5 802.11ag (2915 only)
|
||||
6 802.11bg
|
||||
7 802.11abg (2915 only)
|
||||
|
||||
get_preamble
|
||||
Can be used to report configuration of preamble length.
|
||||
|
||||
set_preamble
|
||||
Can be used to set the configuration of preamble length:
|
||||
|
||||
Usage:
|
||||
% iwpriv eth1 set_preamble {mode}
|
||||
Where {mode} is one of:
|
||||
1 Long preamble only
|
||||
0 Auto (long or short based on connection)
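
The set_mode numbers listed above behave like a bit mask: 1 selects
802.11a, 2 selects 802.11b, and 4 selects 802.11g, and the other values are
sums of these. This reading is inferred from the table and is not stated by
the driver documentation. A minimal sketch that decodes a mode number under
that assumption (the function name ipw_mode_name is hypothetical):

```sh
# Decode an ipw2200 set_mode/get_mode number (1-7) into a band name.
# Assumption: bit 0 = 802.11a, bit 1 = 802.11b, bit 2 = 802.11g.
ipw_mode_name() {
    mode=$1
    case $mode in
        [1-7]) ;;
        *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
    bands=
    if [ $((mode & 1)) -ne 0 ]; then bands="${bands}a"; fi
    if [ $((mode & 2)) -ne 0 ]; then bands="${bands}b"; fi
    if [ $((mode & 4)) -ne 0 ]; then bands="${bands}g"; fi
    echo "802.11${bands}"
}
```

The decoded name matches the text that get_mode prints before the number in
parentheses, so "ipw_mode_name 6" yields 802.11bg.
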
|
||||
|
||||
|
||||
1.4. Sysfs Helper Files:
|
||||
-----------------------------------------------
|
||||
|
||||
The Linux kernel provides a pseudo file system that can be used to
|
||||
access various components of the operating system. The Intel(R)
|
||||
PRO/Wireless 2915ABG Driver for Linux exposes several configuration
|
||||
parameters through this mechanism.
|
||||
|
||||
An entry in the sysfs can support reading and/or writing. You can
|
||||
typically query the contents of a sysfs entry through the use of cat,
|
||||
and can set the contents via echo. For example:
|
||||
|
||||
% cat /sys/bus/pci/drivers/ipw2200/debug_level
|
||||
|
||||
Will report the current debug level of the driver's logging subsystem
|
||||
(only available if CONFIG_IPW2200_DEBUG was configured when the driver
|
||||
was built).
|
||||
|
||||
You can set the debug level via:
|
||||
|
||||
% echo $VALUE > /sys/bus/pci/drivers/ipw2200/debug_level
|
||||
|
||||
Where $VALUE would be a number in the case of this sysfs entry. The
|
||||
input to sysfs files does not have to be a number. For example, the
|
||||
firmware loader used by hotplug utilizes sysfs entries for transferring
|
||||
the firmware image from user space into the driver.
|
||||
|
||||
The Intel(R) PRO/Wireless 2915ABG Driver for Linux exposes sysfs entries
|
||||
at two levels -- driver level, which apply to all instances of the driver
|
||||
(in the event that there are more than one device installed) and device
|
||||
level, which applies only to the single specific instance.
|
||||
|
||||
|
||||
1.4.1 Driver Level Sysfs Helper Files
|
||||
-----------------------------------------------
|
||||
|
||||
For the driver level files, look in /sys/bus/pci/drivers/ipw2200/
|
||||
|
||||
debug_level
|
||||
|
||||
This controls the same global as the 'debug' module parameter
|
||||
|
||||
|
||||
|
||||
1.4.2 Device Level Sysfs Helper Files
|
||||
-----------------------------------------------
|
||||
|
||||
For the device level files, look in
|
||||
|
||||
/sys/bus/pci/drivers/ipw2200/{PCI-ID}/
|
||||
|
||||
For example:
|
||||
/sys/bus/pci/drivers/ipw2200/0000:02:01.0
|
||||
|
||||
The following device level files are available in that directory:
|
||||
|
||||
rf_kill
|
||||
read -
|
||||
0 = RF kill not enabled (radio on)
|
||||
1 = SW based RF kill active (radio off)
|
||||
2 = HW based RF kill active (radio off)
|
||||
3 = Both HW and SW RF kill active (radio off)
|
||||
write -
|
||||
0 = If SW based RF kill active, turn the radio back on
|
||||
1 = If radio is on, activate SW based RF kill
|
||||
|
||||
NOTE: If you enable the SW based RF kill and then toggle the HW
|
||||
based RF kill from ON -> OFF -> ON, the radio will NOT come back on
|
||||
|
||||
ucode
|
||||
read-only access to the ucode version number
|
||||
|
||||
led
|
||||
read -
|
||||
0 = LED code disabled
|
||||
1 = LED code enabled
|
||||
write -
|
||||
0 = Disable LED code
|
||||
1 = Enable LED code
|
||||
|
||||
NOTE: The LED code has been reported to hang some systems when
|
||||
running ifconfig and is therefore disabled by default.
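
The rf_kill values listed above can be turned into a readable status with
a small helper. This is a sketch only: the function names are hypothetical,
and the device directory must be passed in by the caller. It also makes the
rf_kill NOTE easier to follow: once the SW kill is active, toggling the HW
switch moves the value between 3 and 1, and the radio is off in both states.

```sh
# Describe an rf_kill value as documented for the read side of the file.
rf_kill_describe() {
    case $1 in
        0) echo "radio on" ;;
        1) echo "radio off (SW RF kill active)" ;;
        2) echo "radio off (HW RF kill active)" ;;
        3) echo "radio off (HW and SW RF kill active)" ;;
        *) echo "unknown rf_kill value: $1" >&2; return 1 ;;
    esac
}

# Read and describe rf_kill for one device directory, for example
#   rf_kill_status /sys/bus/pci/drivers/ipw2200/0000:02:01.0
rf_kill_status() {
    rf_kill_describe "$(cat "$1/rf_kill")"
}
```
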
|
||||
|
||||
|
||||
1.5. Supported channels
|
||||
-----------------------------------------------
|
||||
|
||||
Upon loading the Intel(R) PRO/Wireless 2915ABG Driver for Linux, a
|
||||
message stating the detected geography code and the number of 802.11
|
||||
channels supported by the card will be displayed in the log.
|
||||
|
||||
The geography code corresponds to a regulatory domain as shown in the
|
||||
table below.
|
||||
|
||||
                                  Supported channels
Code    Geography                     802.11bg    802.11a

---     Restricted                    11           0
ZZF     Custom US/Canada              11           8
ZZD     Rest of World                 13           0
ZZA     Custom USA & Europe & High    11          13
ZZB     Custom NA & Europe            11          13
ZZC     Custom Japan                  11           4
ZZM     Custom                        11           0
ZZE     Europe                        13          19
ZZJ     Custom Japan                  14           4
ZZR     Rest of World                 14           0
ZZH     High Band                     13           4
ZZG     Custom Europe                 13           4
ZZK     Europe                        13          24
ZZL     Europe                        11          13
|
||||
|
||||
|
||||
2. Ad-Hoc Networking
|
||||
-----------------------------------------------
|
||||
|
||||
When using a device in an Ad-Hoc network, it is useful to understand the
|
||||
sequence and requirements for the driver to be able to create, join, or
|
||||
merge networks.
|
||||
|
||||
The following attempts to provide enough information so that you can
|
||||
have a consistent experience while using the driver as a member of an
|
||||
Ad-Hoc network.
|
||||
|
||||
2.1. Joining an Ad-Hoc Network
|
||||
-----------------------------------------------
|
||||
|
||||
The easiest way to get onto an Ad-Hoc network is to join one that
|
||||
already exists.
|
||||
|
||||
2.2. Creating an Ad-Hoc Network
|
||||
-----------------------------------------------
|
||||
|
||||
An Ad-Hoc network is created using the syntax of the Wireless Tools.
|
||||
|
||||
For Example:
|
||||
iwconfig eth1 mode ad-hoc essid testing channel 2
|
||||
|
||||
2.3. Merging Ad-Hoc Networks
|
||||
-----------------------------------------------
|
||||
|
||||
|
||||
3. Interaction with Wireless Tools
|
||||
-----------------------------------------------
|
||||
|
||||
3.1 iwconfig mode
|
||||
-----------------------------------------------
|
||||
|
||||
When configuring the mode of the adapter, all run-time configured parameters
|
||||
are reset to the value used when the module was loaded. This includes
|
||||
channels, rates, ESSID, etc.
|
||||
|
||||
3.2 iwconfig sens
|
||||
-----------------------------------------------
|
||||
|
||||
The 'iwconfig ethX sens XX' command will not set the signal sensitivity
|
||||
threshold, as described in iwconfig documentation, but rather the number
|
||||
of consecutive missed beacons that will trigger handover, i.e. roaming
|
||||
to another access point. At the same time, it will set the disassociation
|
||||
threshold to 3 times the given value.
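
As a worked example of the arithmetic above, the sketch below prints both
thresholds for a given 'sens' value. The function name and the output
labels are hypothetical; only the factor of 3 comes from the text.

```sh
# Print the thresholds implied by "iwconfig ethX sens N" on this driver:
# roaming after N missed beacons, disassociation after 3 * N.
ipw_sens_thresholds() {
    n=$1
    case $n in
        '' | *[!0-9]*)
            echo "sens must be a non-negative integer" >&2
            return 1 ;;
    esac
    echo "roam_after=$n disassociate_after=$((n * 3))"
}
```

For instance, "iwconfig eth1 sens 5" would correspond to roaming after 5
consecutive missed beacons and disassociating after 15.
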
|
||||
|
||||
|
||||
4. About the Version Numbers
|
||||
-----------------------------------------------
|
||||
|
||||
Due to the nature of open source development projects, there are
|
||||
frequently changes being incorporated that have not gone through
|
||||
a complete validation process. These changes are incorporated into
|
||||
development snapshot releases.
|
||||
|
||||
Releases are numbered with a three level scheme:
|
||||
|
||||
major.minor.development
|
||||
|
||||
Any version where the 'development' portion is 0 (for example
|
||||
1.0.0, 1.1.0, etc.) indicates a stable version that will be made
|
||||
available for kernel inclusion.
|
||||
|
||||
Any version where the 'development' portion is not a 0 (for
|
||||
example 1.0.1, 1.1.5, etc.) indicates a development version that is
|
||||
being made available for testing and cutting edge users. The stability
|
||||
and functionality of the development releases are not known. We make
|
||||
efforts to try and keep all snapshots reasonably stable, but due to the
|
||||
frequency of their release, and the desire to get those releases
|
||||
available as quickly as possible, unknown anomalies should be expected.
|
||||
|
||||
The major version number will be incremented when significant changes
|
||||
are made to the driver. Currently, there are no major changes planned.
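
The numbering rule above can be checked mechanically. The following is a
minimal sketch with a hypothetical function name; it only applies the rule
as described here. Under this rule, the 1.1.2 release mentioned in section
1.1 classifies as a development version.

```sh
# Classify a major.minor.development version string as "stable"
# (development part is 0) or "development" (anything else).
ipw_release_kind() {
    case $1 in
        '' | *[!0-9.]* | .* | *. | *..*)
            echo "bad version string: $1" >&2
            return 1 ;;
    esac
    # Split the string on dots into the positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    if [ $# -ne 3 ]; then
        echo "expected major.minor.development" >&2
        return 1
    fi
    if [ "$3" -eq 0 ]; then
        echo stable
    else
        echo development
    fi
}
```
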
|
||||
|
||||
5. Firmware installation
|
||||
----------------------------------------------
|
||||
|
||||
The driver requires a firmware image. Download it and extract the
files under /lib/firmware (or wherever your hotplug's firmware.agent
will look for firmware files).
|
||||
|
||||
The firmware can be downloaded from the following URL:
|
||||
|
||||
http://ipw2200.sf.net/
|
||||
|
||||
|
||||
6. Support
|
||||
-----------------------------------------------
|
||||
|
||||
For direct support of the 1.0.0 version, you can contact
|
||||
http://supportmail.intel.com, or you can use the open source project
|
||||
support.
|
||||
|
||||
For general information and support, go to:
|
||||
|
||||
http://ipw2200.sf.net/
|
||||
|
||||
|
||||
7. License
|
||||
-----------------------------------------------
|
||||
|
||||
Copyright(c) 2003 - 2006 Intel Corporation. All rights reserved.
|
||||
|
||||
This program is free software; you can redistribute it and/or modify it
|
||||
under the terms of the GNU General Public License version 2 as
|
||||
published by the Free Software Foundation.
|
||||
|
||||
This program is distributed in the hope that it will be useful, but WITHOUT
|
||||
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
|
||||
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
|
||||
more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License along with
|
||||
this program; if not, write to the Free Software Foundation, Inc., 59
|
||||
Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||||
|
||||
The full GNU General Public License is included in this distribution in the
|
||||
file called LICENSE.
|
||||
|
||||
Contact Information:
|
||||
James P. Ketrenos <ipw2100-admin@linux.intel.com>
|
||||
Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
|
||||
|
||||
207
Documentation/networking/README.sb1000
Normal file
|
|
@ -0,0 +1,207 @@
|
|||
sb1000 is a module network device driver for the General Instrument (also known
|
||||
as NextLevel) SURFboard1000 internal cable modem board. This is an ISA card
|
||||
which is used by a number of cable TV companies to provide cable modem access.
|
||||
It's a one-way downstream-only cable modem, meaning that your upstream net link
|
||||
is provided by your regular phone modem.
|
||||
|
||||
This driver was written by Franco Venturi <fventuri@mediaone.net>. He deserves
|
||||
a great deal of thanks for this wonderful piece of code!
|
||||
|
||||
-----------------------------------------------------------------------------
|
||||
|
||||
Support for this device is now a part of the standard Linux kernel. The
|
||||
driver source code file is drivers/net/sb1000.c. In addition to this
|
||||
you will need:
|
||||
|
||||
1.) The "cmconfig" program. This is a utility which supplements "ifconfig"
|
||||
to configure the cable modem and network interface (usually called "cm0");
|
||||
and
|
||||
|
||||
2.) Several PPP scripts which live in /etc/ppp to make connecting via your
|
||||
cable modem easy.
|
||||
|
||||
These utilities can be obtained from:
|
||||
|
||||
http://www.jacksonville.net/~fventuri/
|
||||
|
||||
in Franco's original source code distribution .tar.gz file. Support for
|
||||
the sb1000 driver can be found at:
|
||||
|
||||
http://web.archive.org/web/*/http://home.adelphia.net/~siglercm/sb1000.html
|
||||
http://web.archive.org/web/*/http://linuxpower.cx/~cable/
|
||||
|
||||
along with these utilities.
|
||||
|
||||
3.) The standard isapnp tools. These are necessary to configure your SB1000
|
||||
card at boot time (or afterwards by hand) since it's a PnP card.
|
||||
|
||||
If you don't have these installed as a standard part of your Linux
|
||||
distribution, you can find them at:
|
||||
|
||||
http://www.roestock.demon.co.uk/isapnptools/
|
||||
|
||||
or check your Linux distribution binary CD or their web site. For help with
|
||||
isapnp, pnpdump, or /etc/isapnp.conf, go to:
|
||||
|
||||
http://www.roestock.demon.co.uk/isapnptools/isapnpfaq.html
|
||||
|
||||
-----------------------------------------------------------------------------
|
||||
|
||||
To make the SB1000 card work, follow these steps:
|
||||
|
||||
1.) Run `make config', or `make menuconfig', or `make xconfig', whichever
|
||||
you prefer, in the top kernel tree directory to set up your kernel
|
||||
configuration. Make sure to say "Y" to "Prompt for development drivers"
|
||||
and to say "M" to the sb1000 driver. Also say "Y" or "M" to all the standard
|
||||
networking questions to get TCP/IP and PPP networking support.
|
||||
|
||||
2.) *BEFORE* you build the kernel, edit drivers/net/sb1000.c. Make sure
|
||||
to redefine the value of READ_DATA_PORT to match the I/O address used
|
||||
by isapnp to access your PnP cards. This is the value of READPORT in
|
||||
/etc/isapnp.conf or given by the output of pnpdump.
|
||||
|
||||
3.) Build and install the kernel and modules as usual.
|
||||
|
||||
4.) Boot your new kernel following the usual procedures.
|
||||
|
||||
5.) Set up to configure the new SB1000 PnP card by capturing the output
|
||||
of "pnpdump" to a file and editing this file to set the correct I/O ports,
|
||||
IRQ, and DMA settings for all your PnP cards. Make sure none of the settings
|
||||
conflict with one another. Then test this configuration by running the
|
||||
"isapnp" command with your new config file as the input. Check for
|
||||
errors and fix as necessary. (As an aside, I use I/O ports 0x110 and
|
||||
0x310 and IRQ 11 for my SB1000 card and these work well for me. YMMV.)
|
||||
Then save the finished config file as /etc/isapnp.conf for proper configuration
|
||||
on subsequent reboots.
|
||||
|
||||
6.) Download the original file sb1000-1.1.2.tar.gz from Franco's site or one of
|
||||
the others referenced above. As root, unpack it into a temporary directory and
|
||||
do a `make cmconfig' and then `install -c cmconfig /usr/local/sbin'. Don't do
|
||||
`make install' because it expects to find all the utilities built and ready for
|
||||
installation, not just cmconfig.
|
||||
|
||||
7.) As root, copy all the files under the ppp/ subdirectory in Franco's
|
||||
tar file into /etc/ppp, being careful not to overwrite any files that are
|
||||
already in there. Then modify ppp@gi-on to set the correct login name,
|
||||
phone number, and frequency for the cable modem. Also edit pap-secrets
|
||||
to specify your login name and password and any site-specific information
|
||||
you need.
|
||||
|
||||
8.) Be sure to modify /etc/ppp/firewall to use ipchains instead of
|
||||
the older ipfwadm commands from the 2.0.x kernels. There's a neat utility to
|
||||
convert ipfwadm commands to ipchains commands:
|
||||
|
||||
http://users.dhp.com/~whisper/ipfwadm2ipchains/
|
||||
|
||||
You may also wish to modify the firewall script to implement a different
|
||||
firewalling scheme.
|
||||
|
||||
9.) Start the PPP connection via the script /etc/ppp/ppp@gi-on. You must be
|
||||
root to do this. It's better to use a utility like sudo to execute
|
||||
frequently used commands like this with root permissions if possible. If you
|
||||
connect successfully the cable modem interface will come up and you'll see a
|
||||
driver message like this at the console:
|
||||
|
||||
cm0: sb1000 at (0x110,0x310), csn 1, S/N 0x2a0d16d8, IRQ 11.
|
||||
sb1000.c:v1.1.2 6/01/98 (fventuri@mediaone.net)
|
||||
|
||||
The "ifconfig" command should show two new interfaces, ppp0 and cm0.
|
||||
The command "cmconfig cm0" will give you information about the cable modem
|
||||
interface.
|
||||
|
||||
10.) Try pinging a site via `ping -c 5 www.yahoo.com', for example. You should
|
||||
see packets received.
|
||||
|
||||
11.) If you can't get site names (like www.yahoo.com) to resolve into
|
||||
IP addresses (like 204.71.200.67), be sure your /etc/resolv.conf file
|
||||
has no syntax errors and has the right nameserver IP addresses in it.
|
||||
If this doesn't help, try something like `ping -c 5 204.71.200.67' to
|
||||
see if the networking is running but the DNS resolution is where the
|
||||
problem lies.
|
||||
|
||||
12.) If you still have problems, go to the support web sites mentioned above
|
||||
and read the information and documentation there.
|
||||
|
||||
-----------------------------------------------------------------------------
|
||||
|
||||
Common problems:
|
||||
|
||||
1.) Packets go out on the ppp0 interface but don't come back on the cm0
|
||||
interface. It looks like I'm connected but I can't even ping any
|
||||
numerical IP addresses. (This happens predominantly on Debian systems due
|
||||
to a default boot-time configuration script.)
|
||||
|
||||
Solution -- As root `echo 0 > /proc/sys/net/ipv4/conf/cm0/rp_filter' so it
|
||||
can share the same IP address as the ppp0 interface. Note that this
|
||||
command should probably be added to the /etc/ppp/cablemodem script
|
||||
*right*between* the "/sbin/ifconfig" and "/sbin/cmconfig" commands.
|
||||
You may need to do this to /proc/sys/net/ipv4/conf/ppp0/rp_filter as well.
|
||||
If you do this to /proc/sys/net/ipv4/conf/default/rp_filter on each reboot
|
||||
(in rc.local or some such) then any interfaces can share the same IP
|
||||
addresses.
|
||||
|
||||
2.) I get "unresolved symbol" error messages on executing `insmod sb1000.o'.
|
||||
|
||||
Solution -- You probably have a non-matching kernel source tree and
|
||||
/usr/include/linux and /usr/include/asm header files. Make sure you
|
||||
install the correct versions of the header files in these two directories.
|
||||
Then rebuild and reinstall the kernel.
|
||||
|
||||
3.) When isapnp runs it reports an error, and my SB1000 card isn't working.
|
||||
|
||||
Solution -- There's a problem with later versions of isapnp using the "(CHECK)"
|
||||
option in the lines that allocate the two I/O addresses for the SB1000 card.
|
||||
This first popped up on RH 6.0. Delete "(CHECK)" for the SB1000 I/O addresses.
|
||||
Make sure they don't conflict with any other pieces of hardware first! Then
|
||||
rerun isapnp and go from there.
|
||||
|
||||
4.) I can't execute the /etc/ppp/ppp@gi-on file.
|
||||
|
||||
Solution -- As root do `chmod ug+x /etc/ppp/ppp@gi-on'.
|
||||
|
||||
5.) The firewall script isn't working (with 2.2.x and higher kernels).
|
||||
|
||||
Solution -- Use the ipfwadm2ipchains script referenced above to convert the
|
||||
/etc/ppp/firewall script from the deprecated ipfwadm commands to ipchains.
|
||||
|
||||
6.) I'm getting *tons* of firewall deny messages in the /var/kern.log,
|
||||
/var/messages, and/or /var/syslog files, and they're filling up my /var
|
||||
partition!!!
|
||||
|
||||
Solution -- First, tell your ISP that you're receiving DoS (Denial of Service)
|
||||
and/or portscanning (UDP connection attempts) attacks! Look over the deny
|
||||
messages to figure out what the attack is and where it's coming from. Next,
|
||||
edit /etc/ppp/cablemodem and make sure the ",nobroadcast" option is turned on
|
||||
to the "cmconfig" command (uncomment that line). If you're not receiving these
|
||||
denied packets on your broadcast interface (IP address xxx.yyy.zzz.255
|
||||
typically), then someone is attacking your machine in particular. Be careful
|
||||
out there....
|
||||
|
||||
7.) Everything seems to work fine but my computer locks up after a while
|
||||
(and typically during a lengthy download through the cable modem)!
|
||||
|
||||
Solution -- You may need to add a short delay in the driver to 'slow down' the
|
||||
SURFboard because your PC might not be able to keep up with the transfer rate
|
||||
of the SB1000. To do this, it's probably best to download Franco's
|
||||
sb1000-1.1.2.tar.gz archive and build and install sb1000.o manually. You'll
|
||||
want to edit the 'Makefile' and look for the 'SB1000_DELAY'
|
||||
define. Uncomment those 'CFLAGS' lines (and comment out the default ones)
|
||||
and try setting the delay to something like 60 microseconds with:
|
||||
'-DSB1000_DELAY=60'. Then do `make' and as root `make install' and try
|
||||
it out. If it still doesn't work or you like playing with the driver, you may
|
||||
try other numbers. Remember though that the higher the delay, the slower the
|
||||
driver (which slows down the rest of the PC too when it is actively
|
||||
used). Thanks to Ed Daiga for this tip!
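
The rp_filter fix from common problem 1 can be sketched as a small shell
function that could be called from rc.local or from the cablemodem script.
The function name and the PROC_NET_CONF variable are hypothetical;
PROC_NET_CONF exists only so the function can be tried against a scratch
directory instead of /proc.

```sh
# Write 0 to rp_filter for each named interface (for example cm0 ppp0).
# PROC_NET_CONF is a hypothetical override of the /proc location.
disable_rp_filter() {
    conf_dir=${PROC_NET_CONF:-/proc/sys/net/ipv4/conf}
    for iface in "$@"; do
        knob="$conf_dir/$iface/rp_filter"
        if [ -w "$knob" ]; then
            echo 0 > "$knob"
        else
            echo "cannot write $knob (not root, or no such interface?)" >&2
            return 1
        fi
    done
}
```

As root, "disable_rp_filter cm0 ppp0" covers both interfaces named in the
solution, and "disable_rp_filter default" covers the default entry.
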
|
||||
|
||||
-----------------------------------------------------------------------------
|
||||
|
||||
Credits: This README came from Franco Venturi's original README file which is
|
||||
still supplied with his driver .tar.gz archive. I and all other sb1000 users
|
||||
owe Franco a tremendous "Thank you!" Additional thanks go to Carl Patten
|
||||
and Ralph Bonnell who are now managing the Linux SB1000 web site, and to
|
||||
the SB1000 users who reported and helped debug the common problems listed
|
||||
above.
|
||||
|
||||
|
||||
Clemmitt Sigler
|
||||
csigler@vt.edu
|
||||
40
Documentation/networking/alias.txt
Normal file
|
|
@ -0,0 +1,40 @@
|
|||
|
||||
IP-Aliasing:
|
||||
============
|
||||
|
||||
IP-aliases are an obsolete way to manage multiple IP-addresses/masks
|
||||
per interface. Newer tools such as iproute2 support multiple
|
||||
address/prefixes per interface, but aliases are still supported
|
||||
for backwards compatibility.
|
||||
|
||||
An alias is formed by adding a colon and a string when running ifconfig.
|
||||
This string is usually numeric, but this is not a must.
|
||||
|
||||
o Alias creation.
|
||||
Alias creation is done by 'magic' interface naming: eg. to create a
|
||||
200.1.1.1 alias for eth0 ...
|
||||
|
||||
# ifconfig eth0:0 200.1.1.1 etc,etc....
|
||||
~~ -> request alias #0 creation (if not yet exists) for eth0
|
||||
|
||||
The corresponding route is also set up by this command.
|
||||
Please note: The route always points to the base interface.
|
||||
|
||||
|
||||
o Alias deletion.
|
||||
The alias is removed by shutting the alias down:
|
||||
|
||||
# ifconfig eth0:0 down
|
||||
~~~~~~~~~~ -> will delete alias
|
||||
|
||||
|
||||
o Alias (re-)configuring
|
||||
|
||||
Aliases are not real devices, but programs should be able to configure and
|
||||
refer to them as usual (ifconfig, route, etc).
|
||||
|
||||
|
||||
o Relationship with main device
|
||||
|
||||
If the base device is shut down the added aliases will be deleted
|
||||
too.
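
Since the text above points to iproute2 as the newer tool, the sketch below
prints an iproute2 command that roughly corresponds to an ifconfig alias.
It is an illustration under assumptions: the function name is hypothetical,
and the prefix length must be supplied by the caller because the ifconfig
example above does not show a netmask.

```sh
# Print the iproute2 command corresponding to "ifconfig <base:label> <addr>".
# Arguments: alias name (e.g. eth0:0), address, prefix length (assumed).
alias_to_ip_cmd() {
    alias_name=$1
    addr=$2
    prefix=$3
    case $alias_name in
        ?*:?*) ;;
        *)
            echo "not an alias name (expected base:label): $alias_name" >&2
            return 1 ;;
    esac
    # Everything before the first colon is the base interface.
    base=${alias_name%%:*}
    echo "ip addr add $addr/$prefix dev $base label $alias_name"
}
```

For example, "alias_to_ip_cmd eth0:0 200.1.1.1 24" prints a command that
adds the address to eth0 with the label eth0:0, so that ifconfig still
shows it under the alias name.
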
|
||||
263
Documentation/networking/altera_tse.txt
Normal file
|
|
@ -0,0 +1,263 @@
|
|||
Altera Triple-Speed Ethernet MAC driver
|
||||
|
||||
Copyright (C) 2008-2014 Altera Corporation
|
||||
|
||||
This is the driver for the Altera Triple-Speed Ethernet (TSE) controllers
|
||||
using the SGDMA and MSGDMA soft DMA IP components. The driver uses the
|
||||
platform bus to obtain component resources. The designs used to test this
|
||||
driver were built for a Cyclone(R) V SOC FPGA board, a Cyclone(R) V FPGA board,
|
||||
and tested with ARM and NIOS processor hosts separately. The anticipated use
|
||||
cases are simple communications between an embedded system and an external peer
|
||||
for status and simple configuration of the embedded system.
|
||||
|
||||
For more information visit www.altera.com and www.rocketboards.org. Support
|
||||
forums for the driver may be found on www.rocketboards.org, and a design used
|
||||
to test this driver may be found there as well. Support is also available from
|
||||
the maintainer of this driver, found in MAINTAINERS.
|
||||
|
||||
The Triple-Speed Ethernet, SGDMA, and MSGDMA components are all soft IP
|
||||
components that can be assembled and built into an FPGA using the Altera
|
||||
Quartus toolchain. Quartus 13.1 and 14.0 were used to build the design that
|
||||
this driver was tested against. The sopc2dts tool is used to create the
|
||||
device tree for the driver, and may be found at rocketboards.org.
|
||||
|
||||
The driver probe function examines the device tree and determines if the
|
||||
Triple-Speed Ethernet instance is using an SGDMA or MSGDMA component. The
|
||||
probe function then installs the appropriate set of DMA routines to
|
||||
initialize the DMA, set up transmits and receives, and handle interrupts for
the respective configuration.
|
||||
|
||||
The SGDMA component is to be deprecated in the near future (over the next 1-2
|
||||
years as of this writing in early 2014) in favor of the MSGDMA component.
|
||||
SGDMA support is included for existing designs and as a reference in case a
developer wishes to add their own soft DMA logic and driver support. Any
|
||||
new designs should not use the SGDMA.
|
||||
|
||||
The SGDMA supports only a single transmit or receive operation at a time, and
|
||||
therefore will not perform as well as the MSGDMA soft IP. Please
|
||||
visit www.altera.com for known, documented SGDMA errata.
|
||||
|
||||
Scatter-gather DMA is not supported by the SGDMA or MSGDMA at this time.
|
||||
Scatter-gather DMA support will be added in a future maintenance update to this
|
||||
driver.
|
||||
|
||||
Jumbo frames are not supported at this time.
|
||||
|
||||
The driver limits PHY operations to 10/100Mbps, and has not yet been fully
|
||||
tested for 1Gbps. This support will be added in a future maintenance update.
|
||||
|
||||
1) Kernel Configuration
|
||||
The kernel configuration option is ALTERA_TSE:
|
||||
Device Drivers ---> Network device support ---> Ethernet driver support --->
|
||||
Altera Triple-Speed Ethernet MAC support (ALTERA_TSE)
|
||||
|
||||
2) Driver parameters list:
|
||||
debug: message level (0: no output, 16: all);
|
||||
dma_rx_num: Number of descriptors in the RX list (default is 64);
|
||||
dma_tx_num: Number of descriptors in the TX list (default is 64).
|
||||
|
||||
3) Command line options
|
||||
Driver parameters can also be passed on the command line by using:
|
||||
altera_tse=dma_rx_num:128,dma_tx_num:512
|
||||
|
||||
4) Driver information and notes
|
||||
|
||||
4.1) Transmit process
|
||||
When the driver's transmit routine is called by the kernel, it sets up a
|
||||
transmit descriptor by calling the underlying DMA transmit routine (SGDMA or
|
||||
MSGDMA), and initiates a transmit operation. Once the transmit is complete, an
interrupt is driven by the transmit DMA logic. The driver handles the transmit
completion in the context of the interrupt handling chain by recycling the
resources required to send and track the requested transmit operation.
|
||||
|
||||
4.2) Receive process
|
||||
The driver will post receive buffers to the receive DMA logic during driver
|
||||
initialization. Receive buffers may or may not be queued depending upon the
underlying DMA logic (MSGDMA is able to queue receive buffers, SGDMA is not able
|
||||
to queue receive buffers to the SGDMA receive logic). When a packet is
|
||||
received, the DMA logic generates an interrupt. The driver handles a receive
|
||||
interrupt by obtaining the DMA receive logic status, reaping receive
|
||||
completions until no more receive completions are available.
|
||||
|
||||
4.3) Interrupt Mitigation
|
||||
The driver is able to mitigate the number of its DMA interrupts
|
||||
using NAPI for receive operations. Interrupt mitigation is not yet supported
|
||||
for transmit operations, but will be added in a future maintenance release.
|
||||
|
||||
4.4) Ethtool support
|
||||
Ethtool is supported. Driver statistics and internal errors can be read using
the ethtool -S ethX command. It is also possible to dump registers.
|
||||
|
||||
4.5) PHY Support
|
||||
The driver is compatible with PAL to work with PHY and GPHY devices.
|
||||
|
||||
4.7) List of source files:
|
||||
o Kconfig
|
||||
o Makefile
|
||||
o altera_tse_main.c: main network device driver
|
||||
o altera_tse_ethtool.c: ethtool support
|
||||
o altera_tse.h: private driver structure and common definitions
|
||||
o altera_msgdma.h: MSGDMA implementation function definitions
|
||||
o altera_sgdma.h: SGDMA implementation function definitions
|
||||
o altera_msgdma.c: MSGDMA implementation
|
||||
o altera_sgdma.c: SGDMA implementation
|
||||
o altera_sgdmahw.h: SGDMA register and descriptor definitions
|
||||
o altera_msgdmahw.h: MSGDMA register and descriptor definitions
|
||||
o altera_utils.c: Driver utility functions
|
||||
o altera_utils.h: Driver utility function definitions
|
||||
|
||||
5) Debug Information
|
||||
|
||||
The driver exports debug information such as internal statistics,
|
||||
debug information, MAC and DMA registers etc.
|
||||
|
||||
A user may use the ethtool support to get statistics:
|
||||
e.g. using: ethtool -S ethX (that shows the statistics counters)
|
||||
or see the MAC registers, e.g. using: ethtool -d ethX
|
||||
|
||||
The developer can also use the "debug" module parameter to get
|
||||
further debug information.
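
When only one counter is of interest, the "ethtool -S" output can be
filtered with a few lines of shell. This is a sketch with a hypothetical
function name; it assumes the usual "name: value" layout of ethtool
statistics output and does not depend on anything specific to this driver.

```sh
# Read "ethtool -S" output on stdin and print the value of one counter.
# Returns non-zero if the counter is not present in the input.
tse_stat() {
    want=$1
    # Split each line on colons and spaces: "   rx_packets: 98"
    # becomes key=rx_packets, value=98.
    while IFS=': ' read -r key value; do
        if [ "$key" = "$want" ]; then
            echo "$value"
            return 0
        fi
    done
    return 1
}
```

A typical call would be: ethtool -S ethX | tse_stat rx_crc_errors
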
|
||||
|
||||
6) Statistics Support
|
||||
|
||||
The controller and driver support a mix of IEEE standard defined statistics,
|
||||
RFC defined statistics, and driver or Altera defined statistics. The four
|
||||
specifications containing the standard definitions for these statistics are
|
||||
as follows:
|
||||
|
||||
o IEEE 802.3-2012 - IEEE Standard for Ethernet.
|
||||
o RFC 2863 found at http://www.rfc-editor.org/rfc/rfc2863.txt.
|
||||
o RFC 2819 found at http://www.rfc-editor.org/rfc/rfc2819.txt.
|
||||
o Altera Triple Speed Ethernet User Guide, found at http://www.altera.com
|
||||
|
||||
The statistics supported by the TSE and the device driver are as follows:
|
||||
|
||||
"tx_packets" is equivalent to aFramesTransmittedOK defined in IEEE 802.3-2012,
|
||||
Section 5.2.2.1.2. This statistic is the count of frames that are successfully
|
||||
transmitted.
|
||||
|
||||
"rx_packets" is equivalent to aFramesReceivedOK defined in IEEE 802.3-2012,
|
||||
Section 5.2.2.1.5. This statistic is the count of frames that are successfully
|
||||
received. This count does not include any error packets such as CRC errors,
|
||||
length errors, or alignment errors.
|
||||
|
||||
"rx_crc_errors" is equivalent to aFrameCheckSequenceErrors defined in IEEE
|
||||
802.3-2012, Section 5.2.2.1.6. This statistic is the count of frames that are
|
||||
an integral number of bytes in length and do not pass the CRC test as the frame
|
||||
is received.
|
||||
|
||||
"rx_align_errors" is equivalent to aAlignmentErrors defined in IEEE 802.3-2012,
|
||||
Section 5.2.2.1.7. This statistic is the count of frames that are not an
|
||||
integral number of bytes in length and do not pass the CRC test as the frame is
|
||||
received.
|
||||
|
||||
"tx_bytes" is equivalent to aOctetsTransmittedOK defined in IEEE 802.3-2012,
|
||||
Section 5.2.2.1.8. This statistic is the count of data and pad bytes
|
||||
successfully transmitted from the interface.
|
||||
|
||||
"rx_bytes" is equivalent to aOctetsReceivedOK defined in IEEE 802.3-2012,
|
||||
Section 5.2.2.1.14. This statistic is the count of data and pad bytes
|
||||
successfully received by the controller.
|
||||
|
||||
"tx_pause" is equivalent to aPAUSEMACCtrlFramesTransmitted defined in IEEE
|
||||
802.3-2012, Section 30.3.4.2. This statistic is a count of PAUSE frames
|
||||
transmitted from the network controller.
|
||||
|
||||
"rx_pause" is equivalent to aPAUSEMACCtrlFramesReceived defined in IEEE
|
||||
802.3-2012, Section 30.3.4.3. This statistic is a count of PAUSE frames
|
||||
received by the network controller.
|
||||
|
||||
"rx_errors" is equivalent to ifInErrors defined in RFC 2863. This statistic is
|
||||
a count of the number of packets received containing errors that prevented the
|
||||
packet from being delivered to a higher level protocol.
|
||||
|
||||
"tx_errors" is equivalent to ifOutErrors defined in RFC 2863. This statistic
|
||||
is a count of the number of packets that could not be transmitted due to errors.
|
||||
|
||||
"rx_unicast" is equivalent to ifInUcastPkts defined in RFC 2863. This
|
||||
statistic is a count of the number of packets received that were not addressed
|
||||
to the broadcast address or a multicast group.
|
||||
|
||||
"rx_multicast" is equivalent to ifInMulticastPkts defined in RFC 2863. This
|
||||
statistic is a count of the number of packets received that were addressed to
|
||||
a multicast address group.
|
||||
|
||||
"rx_broadcast" is equivalent to ifInBroadcastPkts defined in RFC 2863. This
|
||||
statistic is a count of the number of packets received that were addressed to
|
||||
the broadcast address.
|
||||
|
||||
"tx_discards" is equivalent to ifOutDiscards defined in RFC 2863. This
|
||||
statistic is the number of outbound packets not transmitted even though an
|
||||
error was not detected. An example of a reason this might occur is to free up
|
||||
internal buffer space.
|
||||
|
||||
"tx_unicast" is equivalent to ifOutUcastPkts defined in RFC 2863. This
|
||||
statistic counts the number of packets transmitted that were not addressed to
|
||||
a multicast group or broadcast address.
|
||||
|
||||
"tx_multicast" is equivalent to ifOutMulticastPkts defined in RFC 2863. This
|
||||
statistic counts the number of packets transmitted that were addressed to a
|
||||
multicast group.
|
||||
|
||||
"tx_broadcast" is equivalent to ifOutBroadcastPkts defined in RFC 2863. This
|
||||
statistic counts the number of packets transmitted that were addressed to a
|
||||
broadcast address.
|
||||
|
||||
"ether_drops" is equivalent to etherStatsDropEvents defined in RFC 2819.
|
||||
This statistic counts the number of packets dropped due to lack of internal
|
||||
controller resources.
|
||||
|
||||
"rx_total_bytes" is equivalent to etherStatsOctets defined in RFC 2819.
|
||||
This statistic counts the total number of bytes received by the controller,
|
||||
including error and discarded packets.
|
||||
|
||||
"rx_total_packets" is equivalent to etherStatsPkts defined in RFC 2819.
|
||||
This statistic counts the total number of packets received by the controller,
|
||||
including error, discarded, unicast, multicast, and broadcast packets.
|
||||
|
||||
"rx_undersize" is equivalent to etherStatsUndersizePkts defined in RFC 2819.
|
||||
This statistic counts the number of correctly formed packets received less
|
||||
than 64 bytes long.
|
||||
|
||||
"rx_oversize" is equivalent to etherStatsOversizePkts defined in RFC 2819.
|
||||
This statistic counts the number of correctly formed packets greater than 1518
|
||||
bytes long.
|
||||
|
||||
"rx_64_bytes" is equivalent to etherStatsPkts64Octets defined in RFC 2819.
|
||||
This statistic counts the total number of packets received that were 64 octets
|
||||
in length.
|
||||
|
||||
"rx_65_127_bytes" is equivalent to etherStatsPkts65to127Octets defined in RFC
|
||||
2819. This statistic counts the total number of packets received that were
|
||||
between 65 and 127 octets in length inclusive.
|
||||
|
||||
"rx_128_255_bytes" is equivalent to etherStatsPkts128to255Octets defined in
|
||||
RFC 2819. This statistic is the total number of packets received that were
|
||||
between 128 and 255 octets in length inclusive.
|
||||
|
||||
"rx_256_511_bytes" is equivalent to etherStatsPkts256to511Octets defined in
|
||||
RFC 2819. This statistic is the total number of packets received that were
|
||||
between 256 and 511 octets in length inclusive.
|
||||
|
||||
"rx_512_1023_bytes" is equivalent to etherStatsPkts512to1023Octets defined in
|
||||
RFC 2819. This statistic is the total number of packets received that were
|
||||
between 512 and 1023 octets in length inclusive.
|
||||
|
||||
"rx_1024_1518_bytes" is equivalent to etherStatsPkts1024to1518Octets defined
|
||||
in RFC 2819. This statistic is the total number of packets received that were
|
||||
between 1024 and 1518 octets in length inclusive.
|
||||
|
||||
"rx_gte_1519_bytes" is a statistic defined specific to the behavior of the
|
||||
Altera TSE. This statistic counts the number of received good and errored
|
||||
frames between the length of 1519 and the maximum frame length configured
|
||||
in the frm_length register. See the Altera TSE User Guide for more details.
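
The size counters above form a histogram over the received frame length. As a
rough illustration only (the helper name rx_size_counter is hypothetical, and
the sketch ignores the upper bound configured in frm_length), the mapping from
a length in octets to a counter name can be written as:

```python
# Inclusive length ranges taken from the counter descriptions above.
SIZE_RANGES = ((65, 127), (128, 255), (256, 511), (512, 1023), (1024, 1518))


def rx_size_counter(length):
    """Return the size-histogram counter a frame of `length` octets falls into.

    Frames shorter than 64 octets have no histogram bucket, so None is returned.
    """
    if length < 64:
        return None
    if length == 64:
        return "rx_64_bytes"
    for low, high in SIZE_RANGES:
        if low <= length <= high:
            return f"rx_{low}_{high}_bytes"
    # Anything longer lands in the Altera-specific counter.
    return "rx_gte_1519_bytes"


for size in (64, 100, 1518, 1519):
    print(size, rx_size_counter(size))
```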
|
||||
|
||||
"rx_jabbers" is equivalent to etherStatsJabbers defined in RFC 2819. This
|
||||
statistic is the total number of packets received that were longer than 1518
|
||||
octets, and had either a bad CRC with an integral number of octets (CRC Error)
|
||||
or a bad CRC with a non-integral number of octets (Alignment Error).
|
||||
|
||||
"rx_runts" is equivalent to etherStatsFragments defined in RFC 2819. This
|
||||
statistic is the total number of packets received that were less than 64 octets
|
||||
in length and had either a bad CRC with an integral number of octets (CRC
|
||||
error) or a bad CRC with a non-integral number of octets (Alignment Error).
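
The length and CRC conditions that separate rx_undersize, rx_oversize,
rx_runts, rx_jabbers, rx_crc_errors and rx_align_errors can be summarised in a
few lines. This is a sketch of the definitions quoted above, not of the
hardware: the function name rx_anomaly_counter and its parameters are
hypothetical, and it returns a single name even though a real controller may
increment more than one counter for the same frame.

```python
MIN_FRAME = 64    # octets, shortest well-formed frame
MAX_FRAME = 1518  # octets, longest frame before "oversize"/"jabber"


def rx_anomaly_counter(length, crc_ok, whole_octets=True):
    """Name the anomaly counter matching a received frame, or None if it is fine.

    length       -- frame length in octets
    crc_ok       -- True if the frame passed the CRC test
    whole_octets -- False if the frame was not an integral number of octets
    """
    if length < MIN_FRAME:
        return "rx_undersize" if crc_ok else "rx_runts"
    if length > MAX_FRAME:
        return "rx_oversize" if crc_ok else "rx_jabbers"
    if crc_ok:
        return None
    # Normal length but bad CRC: CRC error or alignment error.
    return "rx_crc_errors" if whole_octets else "rx_align_errors"


print(rx_anomaly_counter(40, crc_ok=False))
print(rx_anomaly_counter(2000, crc_ok=True))
```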
|
||||
3133
Documentation/networking/arcnet-hardware.txt
Normal file
File diff suppressed because it is too large
556
Documentation/networking/arcnet.txt
Normal file
|
|
@ -0,0 +1,556 @@
|
|||
----------------------------------------------------------------------------
|
||||
NOTE: See also arcnet-hardware.txt in this directory for jumper-setting
|
||||
and cabling information if you're like many of us and didn't happen to get a
|
||||
manual with your ARCnet card.
|
||||
----------------------------------------------------------------------------
|
||||
|
||||
Since no one seems to listen to me otherwise, perhaps a poem will get your
|
||||
attention:
|
||||
This driver's getting fat and beefy,
|
||||
But my cat is still named Fifi.
|
||||
|
||||
Hmm, I think I'm allowed to call that a poem, even though it's only two
|
||||
lines. Hey, I'm in Computer Science, not English. Give me a break.
|
||||
|
||||
The point is: I REALLY REALLY REALLY REALLY REALLY want to hear from you if
|
||||
you test this and get it working. Or if you don't. Or anything.
|
||||
|
||||
ARCnet 0.32 ALPHA first made it into the Linux kernel 1.1.80 - this was
|
||||
nice, but after that even FEWER people started writing to me because they
|
||||
didn't even have to install the patch. <sigh>
|
||||
|
||||
Come on, be a sport! Send me a success report!
|
||||
|
||||
(hey, that was even better than my original poem... this is getting bad!)
|
||||
|
||||
|
||||
--------
|
||||
WARNING:
|
||||
--------
|
||||
|
||||
If you don't e-mail me about your success/failure soon, I may be forced to
|
||||
start SINGING. And we don't want that, do we?
|
||||
|
||||
(You know, it might be argued that I'm pushing this point a little too much.
|
||||
If you think so, why not flame me in a quick little e-mail? Please also
|
||||
include the type of card(s) you're using, software, size of network, and
|
||||
whether it's working or not.)
|
||||
|
||||
My e-mail address is: apenwarr@worldvisions.ca
|
||||
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
|
||||
These are the ARCnet drivers for Linux.
|
||||
|
||||
|
||||
This new release (2.91) has been put together by David Woodhouse
|
||||
<dwmw2@infradead.org>, in an attempt to tidy up the driver after adding support
|
||||
for yet another chipset. Now the generic support has been separated from the
|
||||
individual chipset drivers, and the source files aren't quite so packed with
|
||||
#ifdefs! I've changed this file a bit, but kept it in the first person from
|
||||
Avery, because I didn't want to completely rewrite it.
|
||||
|
||||
The previous release resulted from many months of on-and-off effort from me
|
||||
(Avery Pennarun), many bug reports/fixes and suggestions from others, and in
|
||||
particular a lot of input and coding from Tomasz Motylewski. Starting with
|
||||
ARCnet 2.10 ALPHA, Tomasz's all-new-and-improved RFC1051 support has been
|
||||
included and seems to be working fine!
|
||||
|
||||
|
||||
Where do I discuss these drivers?
|
||||
---------------------------------
|
||||
|
||||
Tomasz has been so kind as to set up a new and improved mailing list.
|
||||
Subscribe by sending a message with the BODY "subscribe linux-arcnet YOUR
|
||||
REAL NAME" to listserv@tichy.ch.uj.edu.pl. Then, to submit messages to the
|
||||
list, mail to linux-arcnet@tichy.ch.uj.edu.pl.
|
||||
|
||||
There are archives of the mailing list at:
|
||||
http://epistolary.org/mailman/listinfo.cgi/arcnet
|
||||
|
||||
The people on linux-net@vger.kernel.org (now defunct, replaced by
|
||||
netdev@vger.kernel.org) have also been known to be very helpful, especially
|
||||
when we're talking about ALPHA Linux kernels that may or may not work right
|
||||
in the first place.
|
||||
|
||||
|
||||
Other Drivers and Info
|
||||
----------------------
|
||||
|
||||
You can try my ARCNET page on the World Wide Web at:
|
||||
http://www.qis.net/~jschmitz/arcnet/
|
||||
|
||||
Also, SMC (one of the companies that makes ARCnet cards) has a WWW site you
|
||||
might be interested in, which includes several drivers for various cards
|
||||
including ARCnet. Try:
|
||||
http://www.smc.com/
|
||||
|
||||
Performance Technologies makes various network software that supports
|
||||
ARCnet:
|
||||
http://www.perftech.com/ or ftp to ftp.perftech.com.
|
||||
|
||||
Novell makes a networking stack for DOS which includes ARCnet drivers. Try
|
||||
FTPing to ftp.novell.com.
|
||||
|
||||
You can get the Crynwr packet driver collection (including arcether.com, the
|
||||
one you'll want to use with ARCnet cards) from
|
||||
oak.oakland.edu:/simtel/msdos/pktdrvr. It won't work perfectly on a 386+
|
||||
without patches, though, and also doesn't like several cards. Fixed
|
||||
versions are available on my WWW page, or via e-mail if you don't have WWW
|
||||
access.
|
||||
|
||||
|
||||
Installing the Driver
|
||||
---------------------
|
||||
|
||||
All you will need to do in order to install the driver is:
|
||||
make config
|
||||
(be sure to choose ARCnet in the network devices
|
||||
and at least one chipset driver.)
|
||||
make clean
|
||||
make zImage
|
||||
|
||||
If you obtained this ARCnet package as an upgrade to the ARCnet driver in
|
||||
your current kernel, you will need to first copy arcnet.c over the one in
|
||||
the linux/drivers/net directory.
|
||||
|
||||
You will know the driver is installed properly if you get some ARCnet
|
||||
messages when you reboot into the new Linux kernel.
|
||||
|
||||
There are four chipset options:
|
||||
|
||||
1. Standard ARCnet COM90xx chipset.
|
||||
|
||||
This is the normal ARCnet card, which you've probably got. This is the only
|
||||
chipset driver which will autoprobe if not told where the card is.
|
||||
It accepts the following options on the command line:
|
||||
com90xx=[<io>[,<irq>[,<shmem>]]][,<name>] | <name>
|
||||
|
||||
If you load the chipset support as a module, the options are:
|
||||
io=<io> irq=<irq> shmem=<shmem> device=<name>
|
||||
|
||||
To disable the autoprobe, just specify "com90xx=" on the kernel command line.
|
||||
To specify the name alone, but allow autoprobe, just put "com90xx=<name>"
|
||||
|
||||
2. ARCnet COM20020 chipset.
|
||||
|
||||
This is the new chipset from SMC with support for promiscuous mode (packet
|
||||
sniffing), extra diagnostic information, etc. Unfortunately, there is no
|
||||
sensible method of autoprobing for these cards. You must specify the I/O
|
||||
address on the kernel command line.
|
||||
The command line options are:
|
||||
com20020=<io>[,<irq>[,<node_ID>[,backplane[,CKP[,timeout]]]]][,name]
|
||||
|
||||
If you load the chipset support as a module, the options are:
|
||||
io=<io> irq=<irq> node=<node_ID> backplane=<backplane> clock=<CKP>
|
||||
timeout=<timeout> device=<name>
|
||||
|
||||
The COM20020 chipset allows you to set the node ID in software, overriding the
|
||||
default which is still set in DIP switches on the card. If you don't have the
|
||||
COM20020 data sheets, and you don't know what the other three options refer
|
||||
to, then they won't interest you - forget them.
|
||||
|
||||
3. ARCnet COM90xx chipset in IO-mapped mode.
|
||||
|
||||
This will also work with the normal ARCnet cards, but doesn't use the shared
|
||||
memory. It performs less well than the above driver, but is provided in case
|
||||
you have a card which doesn't support shared memory, or (strangely) in case
|
||||
you have so many ARCnet cards in your machine that you run out of shmem slots.
|
||||
If you don't give the IO address on the kernel command line, then the driver
|
||||
will not find the card.
|
||||
The command line options are:
|
||||
com90io=<io>[,<irq>][,<name>]
|
||||
|
||||
If you load the chipset support as a module, the options are:
|
||||
io=<io> irq=<irq> device=<name>
|
||||
|
||||
4. ARCnet RIM I cards.
|
||||
|
||||
These are COM90xx chips which are _completely_ memory mapped. The support for
|
||||
these is not tested. If you have one, please mail the author with a success
|
||||
report. All options must be specified, except the device name.
|
||||
Command line options:
|
||||
arcrimi=<shmem>,<irq>,<node_ID>[,<name>]
|
||||
|
||||
If you load the chipset support as a module, the options are:
|
||||
shmem=<shmem> irq=<irq> node=<node_ID> device=<name>
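
Each chipset module takes a different set of option names, which is easy to
get wrong when typing an insmod line by hand. The sketch below only restates
the four option lists given above; the table MODULE_OPTIONS and the helper
module_args are hypothetical names used for illustration and are not part of
the ARCnet package.

```python
# Module option names per chipset driver, as listed in this section.
MODULE_OPTIONS = {
    "com90xx": ("io", "irq", "shmem", "device"),
    "com20020": ("io", "irq", "node", "backplane", "clock", "timeout", "device"),
    "com90io": ("io", "irq", "device"),
    "arcrimi": ("shmem", "irq", "node", "device"),
}


def module_args(chipset, **options):
    """Build the "key=value ..." string to pass to insmod for `chipset`."""
    allowed = MODULE_OPTIONS[chipset]
    unknown = sorted(set(options) - set(allowed))
    if unknown:
        raise ValueError(f"{chipset} does not take: {', '.join(unknown)}")
    # Emit options in the order the documentation lists them.
    return " ".join(f"{key}={options[key]}" for key in allowed if key in options)


print(module_args("com20020", io="0x2e0", device="eth1"))
```

The printed string can be appended to the insmod command for the matching
chipset module.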
|
||||
|
||||
|
||||
Loadable Module Support
|
||||
-----------------------
|
||||
|
||||
Configure and rebuild Linux. When asked, answer 'm' to "Generic ARCnet
|
||||
support" and to support for your ARCnet chipset if you want to use the
|
||||
loadable module. You can also say 'y' to "Generic ARCnet support" and 'm'
|
||||
to the chipset support if you wish.
|
||||
|
||||
make config
|
||||
make clean
|
||||
make zImage
|
||||
make modules
|
||||
|
||||
If you're using a loadable module, you need to use insmod to load it, and
|
||||
you can specify various characteristics of your card on the command
|
||||
line. (In recent versions of the driver, autoprobing is much more reliable
|
||||
and works as a module, so most of this is now unnecessary.)
|
||||
|
||||
For example:
|
||||
cd /usr/src/linux/modules
|
||||
insmod arcnet.o
|
||||
insmod com90xx.o
|
||||
insmod com20020.o io=0x2e0 device=eth1
|
||||
|
||||
|
||||
Using the Driver
|
||||
----------------
|
||||
|
||||
If you build your kernel with ARCnet COM90xx support included, it should
|
||||
probe for your card automatically when you boot. If you use a different
|
||||
chipset driver compiled into the kernel, you must give the necessary options
|
||||
on the kernel command line, as detailed above.
|
||||
|
||||
Go read the NET-2-HOWTO and ETHERNET-HOWTO for Linux; they should be
|
||||
available where you picked up this driver. Think of your ARCnet as a
|
||||
souped-up (or down, as the case may be) Ethernet card.
|
||||
|
||||
By the way, be sure to change all references from "eth0" to "arc0" in the
|
||||
HOWTOs. Remember that ARCnet isn't a "true" Ethernet, and the device name
|
||||
is DIFFERENT.
|
||||
|
||||
|
||||
Multiple Cards in One Computer
|
||||
------------------------------
|
||||
|
||||
Linux has pretty good support for this now, but since I've been busy, the
|
||||
ARCnet driver has somewhat suffered in this respect. COM90xx support, if
|
||||
compiled into the kernel, will (try to) autodetect all the installed cards.
|
||||
|
||||
If you have other cards, with support compiled into the kernel, then you can
|
||||
just repeat the options on the kernel command line, e.g.:
|
||||
LILO: linux com20020=0x2e0 com20020=0x380 com90io=0x260
|
||||
|
||||
If you have the chipset support built as a loadable module, then you need to
|
||||
do something like this:
|
||||
insmod -o arc0 com90xx
|
||||
insmod -o arc1 com20020 io=0x2e0
|
||||
insmod -o arc2 com90xx
|
||||
The ARCnet drivers will now sort out their names automatically.
|
||||
|
||||
|
||||
How do I get it to work with...?
|
||||
--------------------------------
|
||||
|
||||
NFS: Should be fine linux->linux, just pretend you're using Ethernet cards.
|
||||
oak.oakland.edu:/simtel/msdos/nfs has some nice DOS clients. There
|
||||
is also a DOS-based NFS server called SOSS. It doesn't multitask
|
||||
quite the way Linux does (actually, it doesn't multitask AT ALL) but
|
||||
you never know what you might need.
|
||||
|
||||
With AmiTCP (and possibly others), you may need to set the following
|
||||
options in your Amiga nfstab: MD 1024 MR 1024 MW 1024
|
||||
(Thanks to Christian Gottschling <ferksy@indigo.tng.oche.de>
|
||||
for this.)
|
||||
|
||||
Probably these refer to maximum NFS data/read/write block sizes. I
|
||||
don't know why the defaults on the Amiga didn't work; write to me if
|
||||
you know more.
|
||||
|
||||
DOS: If you're using the freeware arcether.com, you might want to install
|
||||
the driver patch from my web page. It helps with PC/TCP, and also
|
||||
can get arcether to load if it timed out too quickly during
|
||||
initialization. In fact, if you use it on a 386+ you REALLY need
|
||||
the patch, really.
|
||||
|
||||
Windows: See DOS :) Trumpet Winsock works fine with either the Novell or
|
||||
Arcether client, assuming you remember to load winpkt of course.
|
||||
|
||||
LAN Manager and Windows for Workgroups: These programs use protocols that
|
||||
are incompatible with the Internet standard. They try to pretend
|
||||
the cards are Ethernet, and confuse everyone else on the network.
|
||||
|
||||
However, v2.00 and higher of the Linux ARCnet driver supports this
|
||||
protocol via the 'arc0e' device. See the "Using Multiprotocol ARCnet"
section for more information.
|
||||
|
||||
Using the freeware Samba server and clients for Linux, you can now
|
||||
interface quite nicely with TCP/IP-based WfWg or Lan Manager
|
||||
networks.
|
||||
|
||||
Windows 95: Tools are included with Win95 that let you use either the LANMAN
|
||||
style network drivers (NDIS) or Novell drivers (ODI) to handle your
|
||||
ARCnet packets. If you use ODI, you'll need to use the 'arc0'
|
||||
device with Linux. If you use NDIS, then try the 'arc0e' device.
|
||||
See the "Using Multiprotocol ARCnet" section below if you need arc0e,
|
||||
you're completely insane, and/or you need to build some kind of
|
||||
hybrid network that uses both encapsulation types.
|
||||
|
||||
OS/2: I've been told it works under Warp Connect with an ARCnet driver from
|
||||
SMC. You need to use the 'arc0e' interface for this. If you get
|
||||
the SMC driver to work with the TCP/IP stuff included in the
|
||||
"normal" Warp Bonus Pack, let me know.
|
||||
|
||||
ftp.microsoft.com also has a freeware "Lan Manager for OS/2" client
|
||||
which should use the same protocol as WfWg does. I had no luck
|
||||
installing it under Warp, however. Please mail me with any results.
|
||||
|
||||
NetBSD/AmiTCP: These use an old version of the Internet standard ARCnet
|
||||
protocol (RFC1051) which is compatible with the Linux driver v2.10
|
||||
ALPHA and above using the arc0s device. (See "Using Multiprotocol ARCnet"
|
||||
below.) ** Newer versions of NetBSD apparently support RFC1201.
|
||||
|
||||
|
||||
Using Multiprotocol ARCnet
|
||||
--------------------------
|
||||
|
||||
The ARCnet driver v2.10 ALPHA supports three protocols, each on its own
|
||||
"virtual network device":
|
||||
|
||||
arc0 - RFC1201 protocol, the official Internet standard which just
|
||||
happens to be 100% compatible with Novell's TRXNET driver.
|
||||
Version 1.00 of the ARCnet driver supported _only_ this
|
||||
protocol. arc0 is the fastest of the three protocols (for
|
||||
whatever reason), and allows larger packets to be used
|
||||
because it supports RFC1201 "packet splitting" operations.
|
||||
Unless you have a specific need to use a different protocol,
|
||||
I strongly suggest that you stick with this one.
|
||||
|
||||
arc0e - "Ethernet-Encapsulation" which sends packets over ARCnet
|
||||
that are actually a lot like Ethernet packets, including the
|
||||
6-byte hardware addresses. This protocol is compatible with
|
||||
Microsoft's NDIS ARCnet driver, like the one in WfWg and
|
||||
LANMAN. Because the MTU of 493 is actually smaller than the
|
||||
one "required" by TCP/IP (576), there is a chance that some
|
||||
network operations will not function properly. The Linux
|
||||
TCP/IP layer can compensate in most cases, however, by
|
||||
automatically fragmenting the TCP/IP packets to make them
|
||||
fit. arc0e also works slightly more slowly than arc0, for
|
||||
reasons yet to be determined. (Probably it's the smaller
|
||||
MTU that does it.)
|
||||
|
||||
arc0s - The "[s]imple" RFC1051 protocol is the "previous" Internet
|
||||
standard that is completely incompatible with the new
|
||||
standard. Some software today, however, continues to
|
||||
support the old standard (and only the old standard)
|
||||
including NetBSD and AmiTCP. RFC1051 also does not support
|
||||
RFC1201's packet splitting, and the MTU of 507 is still
|
||||
smaller than the Internet "requirement," so it's quite
|
||||
possible that you may run into problems. It's also slower
|
||||
than RFC1201 by about 25%, for the same reason as arc0e.
|
||||
|
||||
The arc0s support was contributed by Tomasz Motylewski
|
||||
and modified somewhat by me. Bugs are probably my fault.
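
To see what the smaller MTUs of arc0e (493) and arc0s (507) cost in practice,
it helps to count how many IP fragments a datagram turns into. The sketch
below assumes a 20-byte IPv4 header without options and the standard rule
that every fragment except the last carries a multiple of 8 payload bytes;
the function name fragments_needed is made up for this example.

```python
import math

IP_HEADER = 20           # bytes, IPv4 header without options (assumption)
REQUIRED_DATAGRAM = 576  # the datagram size TCP/IP "requires", as noted above


def fragments_needed(datagram_size, mtu, header=IP_HEADER):
    """Number of IPv4 fragments needed to carry one datagram over `mtu`."""
    if datagram_size <= mtu:
        return 1
    payload = datagram_size - header
    # Fragment payloads (except the last) must be a multiple of 8 bytes.
    per_fragment = (mtu - header) // 8 * 8
    return math.ceil(payload / per_fragment)


for device, mtu in (("arc0e", 493), ("arc0s", 507)):
    print(device, fragments_needed(REQUIRED_DATAGRAM, mtu))
```

Under these assumptions a 576-byte datagram needs two fragments on either
device, which fits the note above that the smaller MTU is the likely cause
of the lower speed.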
|
||||
|
||||
You can choose not to compile arc0e and arc0s into the driver if you want -
|
||||
this will save you a bit of memory and avoid confusion when e.g. trying to
|
||||
use the "NFS-root" stuff in recent Linux kernels.
|
||||
|
||||
The arc0e and arc0s devices are created automatically when you first
|
||||
ifconfig the arc0 device. To actually use them, though, you need to also
|
||||
ifconfig the other virtual devices you need. There are a number of ways you
|
||||
can set up your network then:
|
||||
|
||||
|
||||
1. Single Protocol.
|
||||
|
||||
This is the simplest way to configure your network: use just one of the
|
||||
three available protocols. As mentioned above, it's a good idea to use
|
||||
only arc0 unless you have a good reason (like some other software, e.g.
|
||||
WfWg, that only works with arc0e).
|
||||
|
||||
If you need only arc0, then the following commands should get you going:
|
||||
ifconfig arc0 MY.IP.ADD.RESS
|
||||
route add MY.IP.ADD.RESS arc0
|
||||
route add -net SUB.NET.ADD.RESS arc0
|
||||
[add other local routes here]
|
||||
|
||||
If you need arc0e (and only arc0e), it's a little different:
|
||||
ifconfig arc0 MY.IP.ADD.RESS
|
||||
ifconfig arc0e MY.IP.ADD.RESS
|
||||
route add MY.IP.ADD.RESS arc0e
|
||||
route add -net SUB.NET.ADD.RESS arc0e
|
||||
|
||||
arc0s works much the same way as arc0e.
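
The two recipes above differ only in the extra ifconfig for the virtual
device. As a sketch (the helper single_protocol_commands is a hypothetical
name, and the addresses in the example are placeholders), the command list
for any one of the three devices can be generated like this:

```python
def single_protocol_commands(device, ip, subnet):
    """Return the ifconfig/route commands for a single-protocol setup."""
    if device not in ("arc0", "arc0e", "arc0s"):
        raise ValueError(f"unknown ARCnet device: {device}")
    # arc0 must be configured first; it creates the other virtual devices.
    commands = [f"ifconfig arc0 {ip}"]
    if device != "arc0":
        commands.append(f"ifconfig {device} {ip}")
    commands.append(f"route add {ip} {device}")
    commands.append(f"route add -net {subnet} {device}")
    return commands


# Placeholder addresses, for illustration only.
for line in single_protocol_commands("arc0e", "192.168.1.2", "192.168.1.0"):
    print(line)
```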
|
||||
|
||||
|
||||
2. More than one protocol on the same wire.
|
||||
|
||||
Now things start getting confusing. To even try it, you may need to be
|
||||
partly crazy. Here's what *I* did. :) Note that I don't include arc0s in
|
||||
my home network; I don't have any NetBSD or AmiTCP computers, so I only
|
||||
use arc0s during limited testing.
|
||||
|
||||
I have three computers on my home network; two Linux boxes (which prefer
|
||||
RFC1201 protocol, for reasons listed above), and one XT that can't run
|
||||
Linux but runs the free Microsoft LANMAN Client instead.
|
||||
|
||||
Worse, one of the Linux computers (freedom) also has a modem and acts as
|
||||
a router to my Internet provider. The other Linux box (insight) also has
|
||||
its own IP address and needs to use freedom as its default gateway. The
|
||||
XT (patience), however, does not have its own Internet IP address and so
|
||||
I assigned it one on a "private subnet" (as defined by RFC1597).
|
||||
|
||||
To start with, take a simple network with just insight and freedom.
|
||||
Insight needs to:
|
||||
- talk to freedom via RFC1201 (arc0) protocol, because I like it
|
||||
more and it's faster.
|
||||
- use freedom as its Internet gateway.
|
||||
|
||||
That's pretty easy to do. Set up insight like this:
|
||||
ifconfig arc0 insight
|
||||
route add insight arc0
|
||||
route add freedom arc0 /* I would use the subnet here (like I said
|
||||
to do in "single protocol" above),
|
||||
but the rest of the subnet
|
||||
unfortunately lies across the PPP
|
||||
link on freedom, which confuses
|
||||
things. */
|
||||
route add default gw freedom
|
||||
|
||||
And freedom gets configured like so:
|
||||
ifconfig arc0 freedom
|
||||
route add freedom arc0
|
||||
route add insight arc0
|
||||
/* and default gateway is configured by pppd */
|
||||
|
||||
Great, now insight talks to freedom directly on arc0, and sends packets
|
||||
to the Internet through freedom. If you didn't know how to do the above,
|
||||
you should probably stop reading this section now because it only gets
|
||||
worse.
|
||||
|
||||
Now, how do I add patience into the network? It will be using LANMAN
|
||||
Client, which means I need the arc0e device. It needs to be able to talk
|
||||
to both insight and freedom, and also use freedom as a gateway to the
|
||||
Internet. (Recall that patience has a "private IP address" which won't
|
||||
work on the Internet; that's okay, I configured Linux IP masquerading on
|
||||
freedom for this subnet).
|
||||
|
||||
So patience (necessarily; I don't have another IP number from my
|
||||
provider) has an IP address on a different subnet than freedom and
|
||||
insight, but needs to use freedom as an Internet gateway. Worse, most
|
||||
DOS networking programs, including LANMAN, have braindead networking
|
||||
schemes that rely completely on the netmask and a 'default gateway' to
|
||||
determine how to route packets. This means that to get to freedom or
|
||||
insight, patience WILL send through its default gateway, regardless of
|
||||
the fact that both freedom and insight (courtesy of the arc0e device)
|
||||
could understand a direct transmission.
|
||||
|
||||
I compensate by giving freedom an extra IP address - aliased 'gatekeeper'
|
||||
- that is on my private subnet, the same subnet that patience is on. I
|
||||
then define gatekeeper to be the default gateway for patience.
|
||||
|
||||
To configure freedom (in addition to the commands above):
|
||||
ifconfig arc0e gatekeeper
|
||||
route add gatekeeper arc0e
|
||||
route add patience arc0e
|
||||
|
||||
This way, freedom will send all packets for patience through arc0e,
|
||||
giving its IP address as gatekeeper (on the private subnet). When it
|
||||
talks to insight or the Internet, it will use its "freedom" Internet IP
|
||||
address.
|
||||
|
||||
You will notice that we haven't configured the arc0e device on insight.
|
||||
This would work, but is not really necessary, and would require me to
|
||||
assign insight another special IP number from my private subnet. Since
|
||||
both insight and patience are using freedom as their default gateway, the
|
||||
two can already talk to each other.
|
||||
|
||||
It's quite fortunate that I set things up like this the first time (cough
|
||||
cough) because it's really handy when I boot insight into DOS. There, it
|
||||
runs the Novell ODI protocol stack, which only works with RFC1201 ARCnet.
|
||||
In this mode it would be impossible for insight to communicate directly
|
||||
with patience, since the Novell stack is incompatible with Microsoft's
|
||||
Ethernet-Encap. Without changing any settings on freedom or patience, I
|
||||
simply set freedom as the default gateway for insight (now in DOS,
|
||||
remember) and all the forwarding happens "automagically" between the two
|
||||
hosts that would normally not be able to communicate at all.
|
||||
|
||||
For those who like diagrams, I have created two "virtual subnets" on the
|
||||
same physical ARCnet wire. You can picture it like this:
|
||||
|
||||
|
||||
[RFC1201 NETWORK] [ETHER-ENCAP NETWORK]
|
||||
(registered Internet subnet) (RFC1597 private subnet)
|
||||
|
||||
(IP Masquerade)
|
||||
/---------------\ * /---------------\
|
||||
| | * | |
|
||||
| +-Freedom-*-Gatekeeper-+ |
|
||||
| | | * | |
|
||||
\-------+-------/ | * \-------+-------/
|
||||
| | |
|
||||
Insight | Patience
|
||||
(Internet)
|
||||
|
||||
|
||||
|
||||
It works: what now?
|
||||
-------------------
|
||||
|
||||
Send mail describing your setup, preferably including driver version, kernel
|
||||
version, ARCnet card model, CPU type, number of systems on your network, and
|
||||
list of software in use to me at the following address:
|
||||
apenwarr@worldvisions.ca
|
||||
|
||||
I do send (sometimes automated) replies to all messages I receive. My email
|
||||
can be weird (and also usually gets forwarded all over the place along the
|
||||
way to me), so if you don't get a reply within a reasonable time, please
|
||||
resend.
|
||||
|
||||
|
||||
It doesn't work: what now?
|
||||
--------------------------
|
||||
|
||||
Do the same as above, but also include the output of the ifconfig and route
|
||||
commands, as well as any pertinent log entries (i.e. anything that starts
|
||||
with "arcnet:" and has shown up since the last reboot) in your mail.
|
||||
|
||||
If you want to try fixing it yourself (I strongly recommend that you mail me
|
||||
about the problem first, since it might already have been solved) you may
|
||||
want to try some of the debug levels available. For heavy testing on
|
||||
D_DURING or more, it would be a REALLY good idea to kill your klogd daemon
|
||||
first! D_DURING displays 4-5 lines for each packet sent or received. D_TX,
|
||||
D_RX, and D_SKB actually DISPLAY each packet as it is sent or received,
|
||||
which is obviously quite big.
|
||||
|
||||
Starting with v2.40 ALPHA, the autoprobe routines have changed
|
||||
significantly. In particular, they won't tell you why the card was not
|
||||
found unless you turn on the D_INIT_REASONS debugging flag.
|
||||
|
||||
Once the driver is running, you can run the arcdump shell script (available
|
||||
from me or in the full ARCnet package, if you have it) as root to list the
|
||||
contents of the arcnet buffers at any time. To make any sense at all out of
|
||||
this, you should grab the pertinent RFCs (some are listed near the top of
|
||||
arcnet.c). arcdump assumes your card is at 0xD0000. If it isn't, edit the
|
||||
script.
|
||||
|
||||
Buffers 0 and 1 are used for receiving, and Buffers 2 and 3 are for sending.
|
||||
Ping-pong buffers are implemented both ways.
|
||||
|
||||
If your debug level includes D_DURING and you did NOT define SLOW_XMIT_COPY,
|
||||
the buffers are cleared to a constant value of 0x42 every time the card is
|
||||
reset (which should only happen when you do an ifconfig up, or when Linux
|
||||
decides that the driver is broken). During a transmit, unused parts of the
|
||||
buffer will be cleared to 0x42 as well. This is to make it easier to figure
|
||||
out which bytes are being used by a packet.
|
||||
|
||||
You can change the debug level without recompiling the kernel by typing:
|
||||
ifconfig arc0 down metric 1xxx
|
||||
/etc/rc.d/rc.inet1
|
||||
where "xxx" is the debug level you want. For example, "metric 1015" would put
|
||||
you at debug level 15. Debug level 7 is currently the default.
|
||||
|
||||
Note that the debug level is (starting with v1.90 ALPHA) a binary
|
||||
combination of different debug flags; so debug level 7 is really 1+2+4 or
|
||||
D_NORMAL+D_EXTRA+D_INIT. To include D_DURING, you would add 16 to this,
|
||||
resulting in debug level 23.
|
||||
|
||||
If you don't understand that, you probably don't want to know anyway.
|
||||
E-mail me about your problem.
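The debug-level arithmetic described above can be sketched in a few lines of
Python. This is only an illustration: the flag values are the four that the
text states for this driver version (other flags such as D_TX are omitted
because their values are not given here), and the assumption that the metric
is simply 1000 plus the debug level is inferred from the "metric 1015"
example.

```python
# Flag values exactly as given in the text above; other flags are omitted
# because the text does not state their values.
DEBUG_FLAGS = {
    "D_NORMAL": 1,
    "D_EXTRA": 2,
    "D_INIT": 4,
    "D_DURING": 16,
}


def debug_level(*names):
    """Combine the named debug flags into a single debug level."""
    level = 0
    for name in names:
        level |= DEBUG_FLAGS[name]
    return level


def metric_for_level(level):
    """Return the ifconfig metric that selects a debug level ("1xxx").

    Assumption: "xxx" is the decimal debug level, so only 0..999 fit.
    """
    if not 0 <= level <= 999:
        raise ValueError("debug level must fit in three decimal digits")
    return 1000 + level


default_level = debug_level("D_NORMAL", "D_EXTRA", "D_INIT")
with_during = default_level | DEBUG_FLAGS["D_DURING"]
print(default_level, with_during, metric_for_level(with_during))
```

With these values, the default level comes out as 7, adding D_DURING gives 23,
and the matching ifconfig argument would be "metric 1023".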
|
||||
|
||||
|
||||
I want to send money: what now?
|
||||
-------------------------------
|
||||
|
||||
Go take a nap or something. You'll feel better in the morning.
|
||||
8
Documentation/networking/atm.txt
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
In order to use anything but the most primitive functions of ATM,
|
||||
several user-mode programs are required to assist the kernel. These
|
||||
programs and related material can be found via the ATM on Linux Web
|
||||
page at http://linux-atm.sourceforge.net/
|
||||
|
||||
If you encounter problems with ATM, please report them on the ATM
|
||||
on Linux mailing list. Subscription information, archives, etc.,
|
||||
can be found on http://linux-atm.sourceforge.net/
|
||||
10
Documentation/networking/ax25.txt
Normal file
|
|
@ -0,0 +1,10 @@
|
|||
To use the amateur radio protocols within Linux you will need to get a
|
||||
suitable copy of the AX.25 Utilities. More detailed information about
|
||||
AX.25, NET/ROM and ROSE, associated programs and utilities can be
|
||||
found on http://www.linux-ax25.org.
|
||||
|
||||
There is an active mailing list for discussing Linux amateur radio matters
|
||||
called linux-hams@vger.kernel.org. To subscribe to it, send a message to
|
||||
majordomo@vger.kernel.org with the words "subscribe linux-hams" in the body
|
||||
of the message; the subject field is ignored. You don't need to be
|
||||
subscribed to post but of course that means you might miss an answer.
|
||||
202
Documentation/networking/batman-adv.txt
Normal file
|
|
@ -0,0 +1,202 @@
|
|||
BATMAN-ADV
|
||||
----------
|
||||
|
||||
Batman advanced is a new approach to wireless networking which
|
||||
no longer operates on an IP basis. Unlike the batman daemon,
|
||||
which exchanges information using UDP packets and sets routing
|
||||
tables, batman-advanced operates on ISO/OSI Layer 2 only and uses
|
||||
and routes (or better: bridges) Ethernet Frames. It emulates a
|
||||
virtual network switch of all nodes participating. Therefore all
|
||||
nodes appear to be link local, thus all higher operating proto-
|
||||
cols won't be affected by any changes within the network. You can
|
||||
run almost any protocol above batman advanced, prominent examples
|
||||
are: IPv4, IPv6, DHCP, IPX.
|
||||
|
||||
Batman advanced was implemented as a Linux kernel driver to re-
|
||||
duce the overhead to a minimum. It does not depend on any (other)
|
||||
network driver, and can be used on wifi as well as ethernet lan,
|
||||
vpn, etc ... (anything with ethernet-style layer 2).
|
||||
|
||||
|
||||
CONFIGURATION
|
||||
-------------
|
||||
|
||||
Load the batman-adv module into your kernel:
|
||||
|
||||
# insmod batman-adv.ko
|
||||
|
||||
The module is now waiting for activation. You must add some in-
|
||||
terfaces on which batman can operate. After loading the module
|
||||
batman advanced will scan your system's interfaces to search for
|
||||
compatible interfaces. Once found, it will create subfolders in
|
||||
the /sys directories of each supported interface, e.g.
|
||||
|
||||
# ls /sys/class/net/eth0/batman_adv/
|
||||
# iface_status mesh_iface
|
||||
|
||||
If an interface does not have the "batman_adv" subfolder it probably
is not supported. Unsupported interfaces are: loopback,
|
||||
non-ethernet and batman's own interfaces.
|
||||
|
||||
Note: After the module was loaded it will continuously watch for
|
||||
new interfaces to verify the compatibility. There is no need to
|
||||
reload the module if you plug your USB wifi adapter into your ma-
|
||||
chine after batman advanced was initially loaded.
|
||||
|
||||
To activate a given interface simply write "bat0" into its
|
||||
"mesh_iface" file inside the batman_adv subfolder:
|
||||
|
||||
# echo bat0 > /sys/class/net/eth0/batman_adv/mesh_iface
|
||||
|
||||
Repeat this step for all interfaces you wish to add. Now batman
|
||||
starts using/broadcasting on this/these interface(s).
|
||||
|
||||
By reading the "iface_status" file you can check its status:
|
||||
|
||||
# cat /sys/class/net/eth0/batman_adv/iface_status
|
||||
# active
|
||||
|
||||
To deactivate an interface you have to write "none" into its
|
||||
"mesh_iface" file:
|
||||
|
||||
# echo none > /sys/class/net/eth0/batman_adv/mesh_iface
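The activation and deactivation steps above can also be scripted. The
following is a minimal sketch, assuming the sysfs layout shown in this
section; the helper names and the sysfs_root parameter are hypothetical and
exist only so the functions can be tried against a scratch directory.

```python
import os


def batman_adv_file(iface, name, sysfs_root="/sys"):
    """Build the path of a per-interface batman_adv sysfs file."""
    return os.path.join(sysfs_root, "class", "net", iface, "batman_adv", name)


def set_mesh_iface(iface, mesh="bat0", sysfs_root="/sys"):
    """Write a mesh interface name (or "none") into the mesh_iface file."""
    with open(batman_adv_file(iface, "mesh_iface", sysfs_root), "w") as f:
        f.write(mesh + "\n")


def iface_status(iface, sysfs_root="/sys"):
    """Return the contents of the iface_status file without the newline."""
    with open(batman_adv_file(iface, "iface_status", sysfs_root)) as f:
        return f.read().strip()
```

Calling set_mesh_iface("eth0") corresponds to the "echo bat0" command, and
set_mesh_iface("eth0", "none") to the deactivation command. Both need root
privileges on a real system.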
|
||||
|
||||
|
||||
All mesh wide settings can be found in batman's own interface
|
||||
folder:
|
||||
|
||||
# ls /sys/class/net/bat0/mesh/
|
||||
#aggregated_ogms distributed_arp_table gw_sel_class orig_interval
|
||||
#ap_isolation fragmentation hop_penalty routing_algo
|
||||
#bonding gw_bandwidth isolation_mark vlan0
|
||||
#bridge_loop_avoidance gw_mode log_level
|
||||
|
||||
There is a special folder for debugging information:
|
||||
|
||||
# ls /sys/kernel/debug/batman_adv/bat0/
|
||||
# bla_backbone_table log transtable_global
|
||||
# bla_claim_table originators transtable_local
|
||||
# gateways socket
|
||||
|
||||
Some of the files contain all sorts of status information regarding
the mesh network. For example, you can view the table of
|
||||
originators (mesh participants) with:
|
||||
|
||||
# cat /sys/kernel/debug/batman_adv/bat0/originators
|
||||
|
||||
Other files allow you to change batman's behaviour to better fit your
|
||||
requirements. For instance, you can check the current originator
|
||||
interval (value in milliseconds which determines how often batman
|
||||
sends its broadcast packets):
|
||||
|
||||
# cat /sys/class/net/bat0/mesh/orig_interval
|
||||
# 1000
|
||||
|
||||
and also change its value:
|
||||
|
||||
# echo 3000 > /sys/class/net/bat0/mesh/orig_interval
|
||||
|
||||
In very mobile scenarios, you might want to adjust the originator
|
||||
interval to a lower value. This will make the mesh more respon-
|
||||
sive to topology changes, but will also increase the overhead.
|
||||
|
||||
|
||||
USAGE
|
||||
-----
|
||||
|
||||
To make use of your newly created mesh, batman advanced provides
|
||||
a new interface "bat0" which you should use from this point on.
|
||||
All interfaces added to batman advanced are not relevant any
|
||||
longer because batman handles them for you. Basically, one "hands
|
||||
over" the data by using the batman interface and batman will make
|
||||
sure it reaches its destination.
|
||||
|
||||
The "bat0" interface can be used like any other regular inter-
|
||||
face. It needs an IP address which can be either statically con-
|
||||
figured or dynamically (by using DHCP or similar services):
|
||||
|
||||
# NodeA: ifconfig bat0 192.168.0.1
|
||||
# NodeB: ifconfig bat0 192.168.0.2
|
||||
# NodeB: ping 192.168.0.1
|
||||
|
||||
Note: In order to avoid problems remove all IP addresses previ-
|
||||
ously assigned to interfaces now used by batman advanced, e.g.
|
||||
|
||||
# ifconfig eth0 0.0.0.0
|
||||
|
||||
|
||||
LOGGING/DEBUGGING
|
||||
-----------------
|
||||
|
||||
All error messages, warnings and information messages are sent to
|
||||
the kernel log. Depending on your operating system distribution
|
||||
this can be read in one of a number of ways. Try using the com-
|
||||
mands: dmesg, logread, or looking in the files /var/log/kern.log
|
||||
or /var/log/syslog. All batman-adv messages are prefixed with
|
||||
"batman-adv:" So to see just these messages try
|
||||
|
||||
# dmesg | grep batman-adv
|
||||
|
||||
When investigating problems with your mesh network it is sometimes
necessary to see more detailed debug messages. This must be
|
||||
enabled when compiling the batman-adv module. When building bat-
|
||||
man-adv as part of kernel, use "make menuconfig" and enable the
|
||||
option "B.A.T.M.A.N. debugging".
|
||||
|
||||
Those additional debug messages can be accessed using a special
|
||||
file in debugfs
|
||||
|
||||
# cat /sys/kernel/debug/batman_adv/bat0/log
|
||||
|
||||
The additional debug output is by default disabled. It can be en-
|
||||
abled during run time. Following log_levels are defined:
|
||||
|
||||
0 - All debug output disabled
|
||||
1 - Enable messages related to routing / flooding / broadcasting
|
||||
2 - Enable messages related to route added / changed / deleted
|
||||
4 - Enable messages related to translation table operations
|
||||
8 - Enable messages related to bridge loop avoidance
|
||||
16 - Enable messages related to DAT, ARP snooping and parsing
|
||||
31 - Enable all messages
|
||||
|
||||
The debug output can be changed at runtime using the file
|
||||
/sys/class/net/bat0/mesh/log_level. e.g.
|
||||
|
||||
# echo 6 > /sys/class/net/bat0/mesh/log_level
|
||||
|
||||
will enable debug messages for route changes (2) and translation table operations (4).
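Because log_level is a bit mask, a value such as 6 is the sum of the levels
2 and 4. A small sketch that decodes a value into the categories listed above
may help when reading an unfamiliar setting. The table is copied from the
list in this section; the function name is hypothetical.

```python
# Bit values and descriptions as listed in the log_level table above.
LOG_LEVELS = {
    1: "routing / flooding / broadcasting",
    2: "route added / changed / deleted",
    4: "translation table operations",
    8: "bridge loop avoidance",
    16: "DAT, ARP snooping and parsing",
}


def decode_log_level(value):
    """Return the message categories enabled by a log_level value."""
    if not 0 <= value <= 31:
        raise ValueError("log_level must be between 0 and 31")
    return [desc for bit, desc in LOG_LEVELS.items() if value & bit]


for category in decode_log_level(6):
    print(category)
```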
|
||||
|
||||
Counters for different types of packets entering and leaving the
|
||||
batman-adv module are available through ethtool:
|
||||
|
||||
# ethtool --statistics bat0
|
||||
|
||||
|
||||
BATCTL
|
||||
------
|
||||
|
||||
As batman advanced operates on layer 2 all hosts participating in
|
||||
the virtual switch are completely transparent for all protocols
|
||||
above layer 2. Therefore the common diagnosis tools do not work
|
||||
as expected. To overcome these problems batctl was created. At
|
||||
the moment batctl contains ping, traceroute, tcpdump and
|
||||
interfaces to the kernel module settings.
|
||||
|
||||
For more information, please see the manpage (man batctl).
|
||||
|
||||
batctl is available on http://www.open-mesh.org/
|
||||
|
||||
|
||||
CONTACT
|
||||
-------
|
||||
|
||||
Please send us comments, experiences, questions, anything :)
|
||||
|
||||
IRC: #batman on irc.freenode.org
|
||||
Mailing-list: b.a.t.m.a.n@open-mesh.org (optional subscription
|
||||
at https://lists.open-mesh.org/mm/listinfo/b.a.t.m.a.n)
|
||||
|
||||
You can also contact the Authors:
|
||||
|
||||
Marek Lindner <mareklindner@neomailbox.ch>
|
||||
Simon Wunderlich <sw@simonwunderlich.de>
|
||||
158
Documentation/networking/baycom.txt
Normal file
|
|
@ -0,0 +1,158 @@
|
|||
LINUX DRIVERS FOR BAYCOM MODEMS
|
||||
|
||||
Thomas M. Sailer, HB9JNX/AE4WA, <sailer@ife.ee.ethz.ch>
|
||||
|
||||
!!NEW!! (04/98) The drivers for the baycom modems have been split into
|
||||
separate drivers as they did not share any code, and the driver
|
||||
and device names have changed.
|
||||
|
||||
This document describes the Linux Kernel Drivers for simple Baycom style
|
||||
amateur radio modems.
|
||||
|
||||
The following drivers are available:
|
||||
|
||||
baycom_ser_fdx:
|
||||
This driver supports the SER12 modems either full or half duplex.
|
||||
Its baud rate may be changed via the `baud' module parameter,
|
||||
therefore it supports just about every bit bang modem on a
|
||||
serial port. Its devices are called bcsf0 through bcsf3.
|
||||
This is the recommended driver for SER12 type modems,
|
||||
however if you have a broken UART clone that does not have working
|
||||
delta status bits, you may try baycom_ser_hdx.
|
||||
|
||||
baycom_ser_hdx:
|
||||
This is an alternative driver for SER12 type modems.
|
||||
It only supports half duplex, and only 1200 baud. Its devices
|
||||
are called bcsh0 through bcsh3. Use this driver only if baycom_ser_fdx
|
||||
does not work with your UART.
|
||||
|
||||
baycom_par:
|
||||
This driver supports the par96 and picpar modems.
|
||||
Its devices are called bcp0 through bcp3.
|
||||
|
||||
baycom_epp:
|
||||
This driver supports the EPP modem.
|
||||
Its devices are called bce0 through bce3.
|
||||
This driver is work-in-progress.
|
||||
|
||||
The following modems are supported:
|
||||
|
||||
ser12: This is a very simple 1200 baud AFSK modem. The modem consists only
|
||||
of a modulator/demodulator chip, usually a TI TCM3105. The computer
|
||||
is responsible for regenerating the receiver bit clock, as well as
|
||||
for handling the HDLC protocol. The modem connects to a serial port,
|
||||
hence the name. Since the serial port is not used as an async serial
|
||||
port, the kernel driver for serial ports cannot be used, and this
|
||||
driver only supports standard serial hardware (8250, 16450, 16550)
|
||||
|
||||
par96: This is a modem for 9600 baud FSK compatible with the G3RUH standard.
|
||||
The modem does all the filtering and regenerates the receiver clock.
|
||||
Data is transferred from and to the PC via a shift register.
|
||||
The shift register is filled with 16 bits and an interrupt is signalled.
|
||||
The PC then empties the shift register in a burst. This modem connects
|
||||
to the parallel port, hence the name. The modem leaves the
|
||||
implementation of the HDLC protocol and the scrambler polynomial to
|
||||
the PC.
|
||||
|
||||
picpar: This is a redesign of the par96 modem by Henning Rech, DF9IC. The modem
|
||||
is protocol compatible with par96, but uses only three low power ICs
|
||||
and can therefore be fed from the parallel port and does not require
|
||||
an additional power supply. Furthermore, it incorporates carrier
|
||||
detect circuitry.
|
||||
|
||||
EPP: This is a high-speed modem adaptor that connects to an enhanced parallel port.
|
||||
Its target audience is users working over a high speed hub (76.8kbit/s).
|
||||
|
||||
eppfpga: This is a redesign of the EPP adaptor.
|
||||
|
||||
|
||||
|
||||
All of the above modems only support half duplex communications. However,
|
||||
the driver supports the KISS (see below) fullduplex command. It then simply
|
||||
starts to send as soon as there's a packet to transmit and does not care
|
||||
about DCD, i.e. it starts to send even if there's someone else on the channel.
|
||||
This command is required by some implementations of the DAMA channel
|
||||
access protocol.
|
||||
|
||||
|
||||
The Interface of the drivers
|
||||
|
||||
Unlike previous drivers, these drivers are no longer character devices,
|
||||
but they are now true kernel network interfaces. Installation is therefore
|
||||
simple. Once installed, four interfaces named bc{sf,sh,p,e}[0-3] are available.
|
||||
sethdlc from the ax25 utilities may be used to set driver states etc.
|
||||
Users of userland AX.25 stacks may use the net2kiss utility (also available
|
||||
in the ax25 utilities package) to convert packets of a network interface
|
||||
to a KISS stream on a pseudo tty. There's also a patch available from
|
||||
me for WAMPES which allows attaching a kernel network interface directly.
|
||||
|
||||
|
||||
Configuring the driver
|
||||
|
||||
Every time a driver is inserted into the kernel, it has to know which
|
||||
modems it should access at which ports. This can be done with the setbaycom
|
||||
utility. If you are only using one modem, you can also configure the
|
||||
driver from the insmod command line (or by means of an option line in
|
||||
/etc/modprobe.d/*.conf).
|
||||
|
||||
Examples:
|
||||
modprobe baycom_ser_fdx mode="ser12*" iobase=0x3f8 irq=4
|
||||
sethdlc -i bcsf0 -p mode "ser12*" io 0x3f8 irq 4
|
||||
|
||||
Both lines configure the first port to drive a ser12 modem at the first
|
||||
serial port (COM1 under DOS). The * in the mode parameter instructs the driver to use
|
||||
the software DCD algorithm (see below).
|
||||
|
||||
insmod baycom_par mode="picpar" iobase=0x378
|
||||
sethdlc -i bcp0 -p mode "picpar" io 0x378
|
||||
|
||||
Both lines configure the first port to drive a picpar modem at the
|
||||
first parallel port (LPT1 under DOS). (Note: picpar implies
|
||||
hardware DCD, par96 implies software DCD).
|
||||
|
||||
The channel access parameters can be set with sethdlc -a or kissparms.
|
||||
Note that both utilities interpret the values slightly differently.
|
||||
|
||||
|
||||
Hardware DCD versus Software DCD
|
||||
|
||||
To avoid collisions on the air, the driver must know when the channel is
|
||||
busy. This is the task of the DCD circuitry/software. The driver may either
|
||||
utilise a software DCD algorithm (options=1) or use a DCD signal from
|
||||
the hardware (options=0).
|
||||
|
||||
ser12: if software DCD is utilised, the radio's squelch should always be
|
||||
open. It is highly recommended to use the software DCD algorithm,
|
||||
as it is much faster than most hardware squelch circuitry. The
|
||||
disadvantage is a slightly higher load on the system.
|
||||
|
||||
par96: the software DCD algorithm for this type of modem is rather poor.
|
||||
The modem simply does not provide enough information to implement
|
||||
a reasonable DCD algorithm in software. Therefore, if your radio
|
||||
feeds the DCD input of the PAR96 modem, the use of the hardware
|
||||
DCD circuitry is recommended.
|
||||
|
||||
picpar: the picpar modem features built-in DCD hardware, which is highly
|
||||
recommended.
|
||||
|
||||
|
||||
|
||||
Compatibility with the rest of the Linux kernel
|
||||
|
||||
The serial driver and the baycom serial drivers compete
|
||||
for the same hardware resources. Of course only one driver can access a given
|
||||
interface at a time. The serial driver grabs all interfaces it can find at
|
||||
startup time. Therefore the baycom drivers subsequently won't be able to
|
||||
access a serial port. You might therefore find it necessary to release
|
||||
a port owned by the serial driver with 'setserial /dev/ttyS# uart none', where
|
||||
# is the number of the interface. The baycom drivers do not reserve any
|
||||
ports at startup, unless one is specified on the 'insmod' command line. Another
|
||||
method to solve the problem is to compile all drivers as modules and
|
||||
leave it to kmod to load the correct driver depending on the application.
|
||||
|
||||
The parallel port drivers (baycom_par, baycom_epp) now use the parport subsystem
|
||||
to arbitrate the ports between different client drivers.
|
||||
|
||||
vy 73s de
|
||||
Tom Sailer, sailer@ife.ee.ethz.ch
|
||||
hb9jnx @ hb9w.ampr.org
|
||||
2746
Documentation/networking/bonding.txt
Normal file
File diff suppressed because it is too large
15
Documentation/networking/bridge.txt
Normal file
|
|
@ -0,0 +1,15 @@
|
|||
In order to use the Ethernet bridging functionality, you'll need the
|
||||
userspace tools.
|
||||
|
||||
Documentation for Linux bridging is on:
|
||||
http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge
|
||||
|
||||
The bridge-utilities are maintained at:
|
||||
git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/bridge-utils.git
|
||||
|
||||
Additionally, the iproute2 utilities can be used to configure
|
||||
bridge devices.
|
||||
|
||||
If you still have questions, don't hesitate to post to the mailing list
|
||||
(more info https://lists.linux-foundation.org/mailman/listinfo/bridge).
|
||||
|
||||
175
Documentation/networking/caif/Linux-CAIF.txt
Normal file
|
|
@ -0,0 +1,175 @@
|
|||
Linux CAIF
|
||||
===========
|
||||
copyright (C) ST-Ericsson AB 2010
|
||||
Author: Sjur Brendeland/ sjur.brandeland@stericsson.com
|
||||
License terms: GNU General Public License (GPL) version 2
|
||||
|
||||
|
||||
Introduction
|
||||
------------
|
||||
CAIF is a MUX protocol used by ST-Ericsson cellular modems for
|
||||
communication between Modem and host. The host processes can open virtual AT
|
||||
channels, initiate GPRS Data connections, Video channels and Utility Channels.
|
||||
The Utility Channels are general purpose pipes between modem and host.
|
||||
|
||||
ST-Ericsson modems support a number of transports between modem
|
||||
and host. Currently, UART and Loopback are available for Linux.
|
||||
|
||||
|
||||
Architecture:
|
||||
------------
|
||||
The implementation of CAIF is divided into:
|
||||
* CAIF Socket Layer and GPRS IP Interface.
|
||||
* CAIF Core Protocol Implementation
|
||||
* CAIF Link Layer, implemented as NET devices.
|
||||
|
||||
|
||||
RTNL
|
||||
!
|
||||
! +------+ +------+
|
||||
! +------+! +------+!
|
||||
! ! IP !! !Socket!!
|
||||
+-------> !interf!+ ! API !+ <- CAIF Client APIs
|
||||
! +------+ +------!
|
||||
! ! !
|
||||
! +-----------+
|
||||
! !
|
||||
! +------+ <- CAIF Core Protocol
|
||||
! ! CAIF !
|
||||
! ! Core !
|
||||
! +------+
|
||||
! +----------!---------+
|
||||
! ! ! !
|
||||
! +------+ +-----+ +------+
|
||||
+--> ! HSI ! ! TTY ! ! USB ! <- Link Layer (Net Devices)
|
||||
+------+ +-----+ +------+
|
||||
|
||||
|
||||
|
||||
I M P L E M E N T A T I O N
|
||||
===========================
|
||||
|
||||
|
||||
CAIF Core Protocol Layer
|
||||
=========================================
|
||||
|
||||
CAIF Core layer implements the CAIF protocol as defined by ST-Ericsson.
|
||||
It implements the CAIF protocol stack in a layered approach, where
|
||||
each layer described in the specification is implemented as a separate layer.
|
||||
The architecture is inspired by the design patterns "Protocol Layer" and
|
||||
"Protocol Packet".
|
||||
|
||||
== CAIF structure ==
|
||||
The Core CAIF implementation contains:
|
||||
- Simple implementation of CAIF.
|
||||
- Layered architecture (a la Streams), each layer in the CAIF
|
||||
specification is implemented in a separate c-file.
|
||||
- Clients must call configuration function to add PHY layer.
|
||||
- Clients must implement CAIF layer to consume/produce
|
||||
CAIF payload with receive and transmit functions.
|
||||
- Clients must call configuration function to add and connect the
|
||||
Client layer.
|
||||
- When receiving / transmitting CAIF Packets (cfpkt), ownership is passed
|
||||
to the called function (except for framing layers' receive function)
|
||||
|
||||
Layered Architecture
|
||||
--------------------
|
||||
The CAIF protocol can be divided into two parts: Support functions and Protocol
|
||||
Implementation. The support functions include:
|
||||
|
||||
- CFPKT CAIF Packet. Implementation of CAIF Protocol Packet. The
|
||||
CAIF Packet has functions for creating, destroying and adding content
|
||||
and for adding/extracting header and trailers to protocol packets.
|
||||
|
||||
The CAIF Protocol implementation contains:
|
||||
|
||||
- CFCNFG CAIF Configuration layer. Configures the CAIF Protocol
|
||||
Stack and provides a Client interface for adding Link-Layer and
|
||||
Driver interfaces on top of the CAIF Stack.
|
||||
|
||||
- CFCTRL CAIF Control layer. Encodes and Decodes control messages
|
||||
such as enumeration and channel setup. Also matches request and
|
||||
response messages.
|
||||
|
||||
- CFSERVL General CAIF Service Layer functionality; handles flow
|
||||
control and remote shutdown requests.
|
||||
|
||||
- CFVEI CAIF VEI layer. Handles CAIF AT Channels on VEI (Virtual
|
||||
External Interface). This layer encodes/decodes VEI frames.
|
||||
|
||||
- CFDGML CAIF Datagram layer. Handles CAIF Datagram layer (IP
|
||||
traffic), encodes/decodes Datagram frames.
|
||||
|
||||
- CFMUX CAIF Mux layer. Handles multiplexing between multiple
|
||||
physical bearers and multiple channels such as VEI, Datagram, etc.
|
||||
The MUX keeps track of the existing CAIF Channels and
|
||||
Physical Instances and selects the appropriate instance based
|
||||
on Channel-Id and Physical-ID.
|
||||
|
||||
- CFFRML CAIF Framing layer. Handles Framing i.e. Frame length
|
||||
and frame checksum.
|
||||
|
||||
- CFSERL CAIF Serial layer. Handles concatenation/split of frames
|
||||
into CAIF Frames with correct length.
|
||||
|
||||
|
||||
|
||||
+---------+
|
||||
| Config |
|
||||
| CFCNFG |
|
||||
+---------+
|
||||
!
|
||||
+---------+ +---------+ +---------+
|
||||
| AT | | Control | | Datagram|
|
||||
| CFVEIL | | CFCTRL | | CFDGML |
|
||||
+---------+ +---------+ +---------+
|
||||
\_____________!______________/
|
||||
!
|
||||
+---------+
|
||||
| MUX |
|
||||
| |
|
||||
+---------+
|
||||
_____!_____
|
||||
/ \
|
||||
+---------+ +---------+
|
||||
| CFFRML | | CFFRML |
|
||||
| Framing | | Framing |
|
||||
+---------+ +---------+
|
||||
! !
|
||||
+---------+ +---------+
|
||||
| | | Serial |
|
||||
| | | CFSERL |
|
||||
+---------+ +---------+
|
||||
|
||||
|
||||
In this layered approach the following "rules" apply.
|
||||
- All layers embed the same structure "struct cflayer"
|
||||
- A layer does not depend on any other layer's private data.
|
||||
- Layers are stacked by setting the pointers
|
||||
layer->up , layer->dn
|
||||
- In order to send data upwards, each layer should do
|
||||
layer->up->receive(layer->up, packet);
|
||||
- In order to send data downwards, each layer should do
|
||||
layer->dn->transmit(layer->dn, packet);
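The stacking rules above can be modelled in a few lines. The sketch below is
a Python stand-in, not the kernel's C implementation: the Layer class only
mirrors the up/dn pointers and the receive/transmit calls named in the rules,
and the "seen" list and the stack() helper are hypothetical additions for the
demonstration.

```python
class Layer:
    """Illustrative stand-in for a CAIF layer; not the kernel's struct cflayer."""

    def __init__(self, name):
        self.name = name
        self.up = None    # neighbour towards the client
        self.dn = None    # neighbour towards the link layer
        self.seen = []    # (direction, packet) pairs, kept only for the demo

    def receive(self, packet):
        # Record the packet, then hand it to the layer above, if any.
        self.seen.append(("rx", packet))
        if self.up is not None:
            self.up.receive(packet)

    def transmit(self, packet):
        # Record the packet, then hand it to the layer below, if any.
        self.seen.append(("tx", packet))
        if self.dn is not None:
            self.dn.transmit(packet)


def stack(*layers):
    """Link layers given bottom-to-top by setting their up/dn pointers."""
    for lower, upper in zip(layers, layers[1:]):
        lower.up = upper
        upper.dn = lower


phy, mux, client = Layer("phy"), Layer("mux"), Layer("client")
stack(phy, mux, client)
phy.receive(b"in")        # travels phy -> mux -> client
client.transmit(b"out")   # travels client -> mux -> phy
print(client.seen)
```

Each layer only touches its direct neighbours through up and dn, which is
what lets a layer stay independent of any other layer's private data.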
|
||||
|
||||
|
||||
CAIF Socket and IP interface
|
||||
===========================
|
||||
|
||||
The IP interface and CAIF socket API are implemented on top of the
|
||||
CAIF Core protocol. The IP Interface and CAIF socket have an instance of
|
||||
'struct cflayer', just like the CAIF Core protocol stack.
|
||||
Net device and Socket implement the 'receive()' function defined by
|
||||
'struct cflayer', just like the rest of the CAIF stack. In this way, transmit and
|
||||
receive of packets is handled as by the rest of the layers: the 'dn->transmit()'
|
||||
function is called in order to transmit data.
|
||||
|
||||
Configuration of Link Layer
|
||||
---------------------------
|
||||
The Link Layer is implemented as Linux network devices (struct net_device).
|
||||
Payload handling and registration is done using standard Linux mechanisms.
|
||||
|
||||
The CAIF Protocol relies on a loss-less link layer without implementing
|
||||
retransmission. This implies that packet drops must not happen.
|
||||
Therefore a flow-control mechanism is implemented where the physical
|
||||
interface can initiate flow stop for all CAIF Channels.
|
||||
109
Documentation/networking/caif/README
Normal file
|
|
@ -0,0 +1,109 @@
|
|||
Copyright (C) ST-Ericsson AB 2010
|
||||
Author: Sjur Brendeland/ sjur.brandeland@stericsson.com
|
||||
License terms: GNU General Public License (GPL) version 2
|
||||
---------------------------------------------------------
|
||||
|
||||
=== Start ===
|
||||
If you have compiled CAIF for modules do:
|
||||
|
||||
$modprobe crc_ccitt
|
||||
$modprobe caif
|
||||
$modprobe caif_socket
|
||||
$modprobe chnl_net
|
||||
|
||||
|
||||
=== Preparing the setup with a STE modem ===
|
||||
|
||||
If you are working on integration of CAIF you should make sure
|
||||
that the kernel is built with module support.
|
||||
|
||||
There are some things that need to be tweaked to get the host TTY correctly
|
||||
set up to talk to the modem.
|
||||
Since the CAIF stack is running in the kernel and we want to use the existing
|
||||
TTY, we are installing our physical serial driver as a line discipline above
|
||||
the TTY device.
|
||||
|
||||
To achieve this we need to install the N_CAIF ldisc from user space.
|
||||
The benefit is that we can hook up to any TTY.
|
||||
|
||||
The use of Start-of-frame-extension (STX) must also be set as
|
||||
module parameter "ser_use_stx".
|
||||
|
||||
Normally Frame Checksum is always used on UART, but this is also provided as a
|
||||
module parameter "ser_use_fcs".
|
||||
|
||||
$ modprobe caif_serial ser_ttyname=/dev/ttyS0 ser_use_stx=yes
|
||||
$ ifconfig caif_ttyS0 up
|
||||
|
||||
PLEASE NOTE: There is a limitation in Android shell.
|
||||
It only accepts one argument to insmod/modprobe!
|
||||
|
||||
=== Troubleshooting ===
|
||||
|
||||
There are debugfs parameters provided for serial communication.
|
||||
/sys/kernel/debug/caif_serial/<tty-name>/
|
||||
|
||||
* ser_state: Prints the bit-mask status where
|
||||
- 0x02 means SENDING, this is a transient state.
|
||||
- 0x10 means FLOW_OFF_SENT, i.e. the previous frame has not been sent
|
||||
and is blocking further send operation. Flow OFF has been propagated
|
||||
to all CAIF Channels using this TTY.
|
||||
|
||||
* tty_status: Prints the bit-mask tty status information
|
||||
- 0x01 - tty->warned is on.
|
||||
- 0x02 - tty->low_latency is on.
|
||||
- 0x04 - tty->packed is on.
|
||||
- 0x08 - tty->flow_stopped is on.
|
||||
- 0x10 - tty->hw_stopped is on.
|
||||
- 0x20 - tty->stopped is on.
|
||||
|
||||
* last_tx_msg: Binary blob. Prints the last transmitted frame.
|
||||
This can be printed with
|
||||
$od --format=x1 /sys/kernel/debug/caif_serial/<tty>/last_tx_msg
|
||||
The first two tx messages sent look like this. Note: The initial
|
||||
byte 02 is start of frame extension (STX) used for re-syncing
|
||||
upon errors.
|
||||
|
||||
- Enumeration:
|
||||
0000000 02 05 00 00 03 01 d2 02
|
||||
| | | | | |
|
||||
STX(1) | | | |
|
||||
Length(2)| | |
|
||||
Control Channel(1)
|
||||
Command:Enumeration(1)
|
||||
Link-ID(1)
|
||||
Checksum(2)
|
||||
- Channel Setup:
|
||||
0000000 02 07 00 00 00 21 a1 00 48 df
|
||||
| | | | | | | |
|
||||
STX(1) | | | | | |
|
||||
Length(2)| | | | |
|
||||
Control Channel(1)
|
||||
Command:Channel Setup(1)
|
||||
Channel Type(1)
|
||||
Priority and Link-ID(1)
|
||||
Endpoint(1)
|
||||
Checksum(2)
|
||||
|
||||
* last_rx_msg: Prints the last received frame.
|
||||
The RX messages for LinkSetup look almost identical but they have the
|
||||
bit 0x20 set in the command byte, and Channel Setup has added one byte
|
||||
before Checksum containing Channel ID.
|
||||
NOTE: Several CAIF Messages might be concatenated. The maximum debug
|
||||
buffer size is 128 bytes.
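The two sample frames shown for last_tx_msg can be split into their fields
with a short script. This is a sketch under stated assumptions: the field
order is taken from the annotated dumps above, the two-byte length is read
as little-endian because "05 00" precedes exactly five bytes in the sample,
and the checksum is extracted but not verified. The function name and the
"params" key are hypothetical.

```python
def parse_caif_frame(hex_dump, use_stx=True):
    """Split one dumped CAIF control frame into its fields.

    hex_dump is the byte part of an od line, e.g. "02 05 00 00 03 01 d2 02".
    """
    data = bytes.fromhex(hex_dump)
    pos = 0
    frame = {}
    if use_stx:
        frame["stx"] = data[0]          # start-of-frame extension byte
        pos = 1
    # Assumption: length is little-endian and counts everything after it,
    # including the two checksum bytes.
    length = int.from_bytes(data[pos:pos + 2], "little")
    pos += 2
    body = data[pos:pos + length]
    if len(body) != length:
        raise ValueError("frame is shorter than its length field")
    frame["length"] = length
    frame["channel"] = body[0]          # 0 is the control channel here
    frame["command"] = body[1]
    frame["params"] = body[2:-2]        # command-specific bytes
    frame["checksum"] = body[-2:]       # extracted only, not verified
    return frame


enumeration = parse_caif_frame("02 05 00 00 03 01 d2 02")
channel_setup = parse_caif_frame("02 07 00 00 00 21 a1 00 48 df")
print(enumeration["command"], channel_setup["params"].hex())
```

For the enumeration frame the only parameter byte is the Link-ID; for the
channel setup frame the parameters are channel type, priority and Link-ID,
and endpoint, in that order.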
|
||||
|
||||
== Error Scenarios:
|
||||
- last_tx_msg contains channel setup message and last_rx_msg is empty ->
|
||||
The host seems to be able to send over the UART, at least the CAIF ldisc gets
|
||||
notified that sending is completed.
|
||||
|
||||
- last_tx_msg contains enumeration message and last_rx_msg is empty ->
|
||||
The host is not able to send the message from UART, the tty has not been
|
||||
able to complete the transmit operation.
|
||||
|
||||
- if /sys/kernel/debug/caif_serial/<tty>/tty_status is non-zero there
|
||||
might be problems transmitting over UART.
|
||||
E.g. if host and modem wiring is not correct you will typically see
|
||||
tty_status = 0x10 (hw_stopped) and ser_state = 0x10 (FLOW_OFF_SENT).
|
||||
You will probably see the enumeration message in last_tx_msg
|
||||
and an empty last_rx_msg.
|
||||
208
Documentation/networking/caif/spi_porting.txt
Normal file
|
|
@ -0,0 +1,208 @@
|
|||
- CAIF SPI porting -
|
||||
|
||||
- CAIF SPI basics:
|
||||
|
||||
Running CAIF over SPI needs some extra setup, owing to the nature of SPI.
|
||||
Two extra GPIOs have been added in order to negotiate the transfers
|
||||
between the master and the slave. The minimum requirement for running
|
||||
CAIF over SPI is a SPI slave chip and two GPIOs (more details below).
|
||||
Please note that running as a slave implies that you need to keep up
|
||||
with the master clock. An overrun or underrun event is fatal.
|
||||
|
||||
- CAIF SPI framework:
|
||||
|
||||
To make porting as easy as possible, the CAIF SPI has been divided into
|
||||
two parts. The first part (called the interface part) deals with all
|
||||
generic functionality such as length framing, SPI frame negotiation
|
||||
and SPI frame delivery and transmission. The other part is the CAIF
|
||||
SPI slave device part, which is the module that you have to write if
|
||||
you want to run SPI CAIF on new hardware. This part takes care of
|
||||
the physical hardware, both with regard to SPI and to GPIOs.
|
||||
|
||||
- Implementing a CAIF SPI device:
|
||||
|
||||
- Functionality provided by the CAIF SPI slave device:
|
||||
|
||||
In order to implement a SPI device you will, as a minimum,
|
||||
need to implement the following
|
||||
functions:
|
||||
|
||||
int (*init_xfer) (struct cfspi_xfer * xfer, struct cfspi_dev *dev):
|
||||
|
||||
This function is called by the CAIF SPI interface to give
|
||||
you a chance to set up your hardware to be ready to receive
|
||||
a stream of data from the master. The xfer structure contains
|
||||
both physical and logical addresses, as well as the total length
|
||||
of the transfer in both directions. The dev parameter can be used
|
||||
to map to different CAIF SPI slave devices.
|
||||
|
||||
void (*sig_xfer) (bool xfer, struct cfspi_dev *dev):
|
||||
|
||||
This function is called by the CAIF SPI interface when the output
|
||||
(SPI_INT) GPIO needs to change state. The boolean value of the xfer
|
||||
variable indicates whether the GPIO should be asserted (HIGH) or
|
||||
deasserted (LOW). The dev parameter can be used to map to different CAIF
|
||||
SPI slave devices.
|
||||
|
||||
- Functionality provided by the CAIF SPI interface:
|
||||
|
||||
void (*ss_cb) (bool assert, struct cfspi_ifc *ifc);
|
||||
|
||||
This function is called by the CAIF SPI slave device in order to
|
||||
signal a change of state of the input GPIO (SS) to the interface.
|
||||
Only edges to the active state must be reported.
|
||||
This function can be called from IRQ context (recommended in order
|
||||
not to introduce latency). The ifc parameter should be the pointer
|
||||
returned from the platform probe function in the SPI device structure.
|
||||
|
||||
void (*xfer_done_cb) (struct cfspi_ifc *ifc);
|
||||
|
||||
This function is called by the CAIF SPI slave device in order to
|
||||
report that a transfer is completed. This function should only be
|
||||
called once both the transmission and the reception are completed.
|
||||
This function can be called from IRQ context (recommended in order
|
||||
not to introduce latency). The ifc parameter should be the pointer
|
||||
returned from the platform probe function in the SPI device structure.
|
||||
|
||||
- Connecting the bits and pieces:
|
||||
|
||||
- Filling in the SPI slave device structure:
|
||||
|
||||
Connect the necessary callback functions.
|
||||
Indicate clock speed (used to calculate toggle delays).
|
||||
Choose a suitable name (helps debugging if you use several CAIF
|
||||
SPI slave devices).
|
||||
Assign your private data (can be used to map to your structure).
|
||||
|
||||
- Filling in the SPI slave platform device structure:
|
||||
Add name of driver to connect to ("cfspi_sspi").
|
||||
Assign the SPI slave device structure as platform data.
|
||||
|
||||
- Padding:
|
||||
|
||||
In order to optimize throughput, a number of SPI padding options are provided.
|
||||
Padding can be enabled independently for uplink and downlink transfers.
|
||||
Padding can be enabled for the head, the tail and for the total frame size.
|
||||
The padding needs to be correctly configured on both sides of the link.
|
||||
The padding can be changed via module parameters in cfspi_sspi.c or via
|
||||
the sysfs directory of the cfspi_sspi driver (before device registration).
|
||||
|
||||
- CAIF SPI device template:
|
||||
|
||||
/*
|
||||
* Copyright (C) ST-Ericsson AB 2010
|
||||
* Author: Daniel Martensson / Daniel.Martensson@stericsson.com
|
||||
* License terms: GNU General Public License (GPL), version 2.
|
||||
*
|
||||
*/
|
||||
|
||||
#include <linux/init.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/device.h>
|
||||
#include <linux/wait.h>
|
||||
#include <linux/interrupt.h>
|
||||
#include <linux/dma-mapping.h>
|
||||
#include <net/caif/caif_spi.h>
#include <linux/platform_device.h>
|
||||
|
||||
MODULE_LICENSE("GPL");
|
||||
|
||||
struct sspi_struct {
|
||||
struct cfspi_dev sdev;
|
||||
struct cfspi_xfer *xfer;
|
||||
};
|
||||
|
||||
static struct sspi_struct slave;
|
||||
static struct platform_device slave_device;
|
||||
|
||||
static irqreturn_t sspi_irq(int irq, void *arg)
|
||||
{
|
||||
/* You only need to trigger on an edge to the active state of the
|
||||
* SS signal. Once an edge is detected, the ss_cb() function should be
|
||||
* called with the parameter assert set to true. It is OK
|
||||
* (and even advised) to call the ss_cb() function in IRQ context in
|
||||
* order not to add any delay. */
|
||||
|
||||
return IRQ_HANDLED;
|
||||
}
|
||||
|
||||
static void sspi_complete(void *context)
|
||||
{
|
||||
/* Normally the DMA or the SPI framework will call you back
|
||||
* in something similar to this. The only thing you need to
|
||||
* do is to call the xfer_done_cb() function, providing the pointer
|
||||
* to the CAIF SPI interface. It is OK to call this function
|
||||
* from IRQ context. */
|
||||
}
|
||||
|
||||
static int sspi_init_xfer(struct cfspi_xfer *xfer, struct cfspi_dev *dev)
|
||||
{
|
||||
/* Store transfer info. For a normal implementation you should
|
||||
* set up your DMA here and make sure that you are ready to
|
||||
* receive the data from the master SPI. */
|
||||
|
||||
struct sspi_struct *sspi = (struct sspi_struct *)dev->priv;
|
||||
|
||||
sspi->xfer = xfer;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
void sspi_sig_xfer(bool xfer, struct cfspi_dev *dev)
|
||||
{
|
||||
/* If xfer is true then you should assert the SPI_INT to indicate to
|
||||
* the master that you are ready to receive the data from the master
|
||||
* SPI. If xfer is false then you should de-assert SPI_INT to indicate
|
||||
* that the transfer is done.
|
||||
*/
|
||||
|
||||
struct sspi_struct *sspi = (struct sspi_struct *)dev->priv;
|
||||
}
|
||||
|
||||
static void sspi_release(struct device *dev)
|
||||
{
|
||||
/*
|
||||
* Here you should release your SPI device resources.
|
||||
*/
|
||||
}
|
||||
|
||||
static int __init sspi_init(void)
|
||||
{
|
||||
/* Here you should initialize your SPI device by providing the
|
||||
* necessary functions, clock speed, name and private data. Once
|
||||
* done, you can register your device with the
|
||||
* platform_device_register() function. This function will return
|
||||
* with the CAIF SPI interface initialized. This is probably also
|
||||
* the place where you should set up your GPIOs, interrupts and SPI
|
||||
* resources. */
|
||||
|
||||
int res = 0;
|
||||
|
||||
/* Initialize slave device. */
|
||||
slave.sdev.init_xfer = sspi_init_xfer;
|
||||
slave.sdev.sig_xfer = sspi_sig_xfer;
|
||||
slave.sdev.clk_mhz = 13;
|
||||
slave.sdev.priv = &slave;
|
||||
slave.sdev.name = "spi_sspi";
|
||||
slave_device.dev.release = sspi_release;
|
||||
|
||||
/* Initialize platform device. */
|
||||
slave_device.name = "cfspi_sspi";
|
||||
slave_device.dev.platform_data = &slave.sdev;
|
||||
|
||||
/* Register platform device. */
|
||||
res = platform_device_register(&slave_device);
|
||||
if (res) {
|
||||
printk(KERN_WARNING "sspi_init: failed to register dev.\n");
|
||||
return -ENODEV;
|
||||
}
|
||||
|
||||
return res;
|
||||
}
|
||||
|
||||
static void __exit sspi_exit(void)
|
||||
{
|
||||
platform_device_unregister(&slave_device);
|
||||
}
|
||||
|
||||
module_init(sspi_init);
|
||||
module_exit(sspi_exit);
|
||||
1200
Documentation/networking/can.txt
Normal file
File diff suppressed because it is too large
339
Documentation/networking/cdc_mbim.txt
Normal file
|
|
@ -0,0 +1,339 @@
|
|||
cdc_mbim - Driver for CDC MBIM Mobile Broadband modems
|
||||
========================================================
|
||||
|
||||
The cdc_mbim driver supports USB devices conforming to the "Universal
|
||||
Serial Bus Communications Class Subclass Specification for Mobile
|
||||
Broadband Interface Model" [1], which is a further development of
|
||||
"Universal Serial Bus Communications Class Subclass Specifications for
|
||||
Network Control Model Devices" [2] optimized for Mobile Broadband
|
||||
devices, aka "3G/LTE modems".
|
||||
|
||||
|
||||
Command Line Parameters
|
||||
=======================
|
||||
|
||||
The cdc_mbim driver has no parameters of its own. But the probing
|
||||
behaviour for NCM 1.0 backwards compatible MBIM functions (an
|
||||
"NCM/MBIM function" as defined in section 3.2 of [1]) is affected
|
||||
by a cdc_ncm driver parameter:
|
||||
|
||||
prefer_mbim
|
||||
-----------
|
||||
Type: Boolean
|
||||
Valid Range: N/Y (0-1)
|
||||
Default Value: Y (MBIM is preferred)
|
||||
|
||||
This parameter sets the system policy for NCM/MBIM functions. Such
|
||||
functions will be handled by either the cdc_ncm driver or the cdc_mbim
|
||||
driver depending on the prefer_mbim setting. Setting prefer_mbim=N
|
||||
makes the cdc_mbim driver ignore these functions and lets the cdc_ncm
|
||||
driver handle them instead.
|
||||
|
||||
The parameter is writable, and can be changed at any time. A manual
|
||||
unbind/bind is required to make the change effective for NCM/MBIM
|
||||
functions bound to the "wrong" driver.
|
||||
|
||||
|
||||
Basic usage
|
||||
===========
|
||||
|
||||
MBIM functions are inactive when unmanaged. The cdc_mbim driver only
|
||||
provides a userspace interface to the MBIM control channel, and will
|
||||
not participate in the management of the function. This implies that a
|
||||
userspace MBIM management application always is required to enable a
|
||||
MBIM function.
|
||||
|
||||
Such userspace applications include, but are not limited to:
|
||||
- mbimcli (included with the libmbim [3] library), and
|
||||
- ModemManager [4]
|
||||
|
||||
Establishing a MBIM IP session requires at least these actions by the
|
||||
management application:
|
||||
- open the control channel
|
||||
- configure network connection settings
|
||||
- connect to network
|
||||
- configure IP interface
|
||||
|
||||
Management application development
|
||||
----------------------------------
|
||||
The driver <-> userspace interfaces are described below. The MBIM
|
||||
control channel protocol is described in [1].
|
||||
|
||||
|
||||
MBIM control channel userspace ABI
|
||||
==================================
|
||||
|
||||
/dev/cdc-wdmX character device
|
||||
------------------------------
|
||||
The driver creates a two-way pipe to the MBIM function control channel
|
||||
using the cdc-wdm driver as a subdriver. The userspace end of the
|
||||
control channel pipe is a /dev/cdc-wdmX character device.
|
||||
|
||||
The cdc_mbim driver does not process or police messages on the control
|
||||
channel. The channel is fully delegated to the userspace management
|
||||
application. It is therefore up to this application to ensure that it
|
||||
complies with all the control channel requirements in [1].
|
||||
|
||||
The cdc-wdmX device is created as a child of the MBIM control
|
||||
interface USB device. The character device associated with a specific
|
||||
MBIM function can be looked up using sysfs. For example:
|
||||
|
||||
bjorn@nemi:~$ ls /sys/bus/usb/drivers/cdc_mbim/2-4:2.12/usbmisc
|
||||
cdc-wdm0
|
||||
|
||||
bjorn@nemi:~$ grep . /sys/bus/usb/drivers/cdc_mbim/2-4:2.12/usbmisc/cdc-wdm0/dev
|
||||
180:0
|
||||
|
||||
|
||||
USB configuration descriptors
|
||||
-----------------------------
|
||||
The wMaxControlMessage field of the CDC MBIM functional descriptor
|
||||
limits the maximum control message size. The management application is
|
||||
responsible for negotiating a control message size complying with the
|
||||
requirements in section 9.3.1 of [1], taking this descriptor field
|
||||
into consideration.
|
||||
|
||||
The userspace application can access the CDC MBIM functional
|
||||
descriptor of a MBIM function using either of the two USB
|
||||
configuration descriptor kernel interfaces described in [6] or [7].
|
||||
|
||||
See also the ioctl documentation below.
|
||||
|
||||
|
||||
Fragmentation
|
||||
-------------
|
||||
The userspace application is responsible for all control message
|
||||
fragmentation and defragmentation, as described in section 9.5 of [1].
|
||||
|
||||
|
||||
/dev/cdc-wdmX write()
|
||||
---------------------
|
||||
The MBIM control messages from the management application *must not*
|
||||
exceed the negotiated control message size.
|
||||
|
||||
|
||||
/dev/cdc-wdmX read()
|
||||
--------------------
|
||||
The management application *must* accept control messages of up to the
|
||||
negotiated control message size.
|
||||
|
||||
|
||||
/dev/cdc-wdmX ioctl()
|
||||
---------------------
|
||||
IOCTL_WDM_MAX_COMMAND: Get Maximum Command Size
|
||||
This ioctl returns the wMaxControlMessage field of the CDC MBIM
|
||||
functional descriptor for MBIM devices. This is intended as a
|
||||
convenience, eliminating the need to parse the USB descriptors from
|
||||
userspace.
|
||||
|
||||
#include <stdio.h>
|
||||
#include <fcntl.h>
|
||||
#include <sys/ioctl.h>
|
||||
#include <linux/types.h>
|
||||
#include <linux/usb/cdc-wdm.h>
|
||||
int main(void)
|
||||
{
|
||||
__u16 max;
|
||||
int fd = open("/dev/cdc-wdm0", O_RDWR);
|
||||
if (!ioctl(fd, IOCTL_WDM_MAX_COMMAND, &max))
|
||||
printf("wMaxControlMessage is %d\n", max);
|
||||
}
|
||||
|
||||
|
||||
Custom device services
|
||||
----------------------
|
||||
The MBIM specification allows vendors to freely define additional
|
||||
services. This is fully supported by the cdc_mbim driver.
|
||||
|
||||
Support for new MBIM services, including vendor specified services, is
|
||||
implemented entirely in userspace, like the rest of the MBIM control
|
||||
protocol.
|
||||
|
||||
New services should be registered in the MBIM Registry [5].
|
||||
|
||||
|
||||
|
||||
MBIM data channel userspace ABI
|
||||
===============================
|
||||
|
||||
wwanY network device
|
||||
--------------------
|
||||
The cdc_mbim driver represents the MBIM data channel as a single
|
||||
network device of the "wwan" type. This network device is initially
|
||||
mapped to MBIM IP session 0.
|
||||
|
||||
|
||||
Multiplexed IP sessions (IPS)
|
||||
-----------------------------
|
||||
MBIM allows multiplexing up to 256 IP sessions over a single USB data
|
||||
channel. The cdc_mbim driver models such IP sessions as 802.1q VLAN
|
||||
subdevices of the master wwanY device, mapping MBIM IP session Z to
|
||||
VLAN ID Z for all values of Z greater than 0.
|
||||
|
||||
The device maximum Z is given in the MBIM_DEVICE_CAPS_INFO structure
|
||||
described in section 10.5.1 of [1].
|
||||
|
||||
The userspace management application is responsible for adding new
|
||||
VLAN links prior to establishing MBIM IP sessions where the SessionId
|
||||
is greater than 0. These links can be added by using the normal VLAN
|
||||
kernel interfaces, either ioctl or netlink.
|
||||
|
||||
For example, adding a link for a MBIM IP session with SessionId 3:
|
||||
|
||||
ip link add link wwan0 name wwan0.3 type vlan id 3
|
||||
|
||||
The driver will automatically map the "wwan0.3" network device to MBIM
|
||||
IP session 3.
|
||||
|
||||
|
||||
Device Service Streams (DSS)
|
||||
----------------------------
|
||||
MBIM also allows up to 256 non-IP data streams to be multiplexed over
|
||||
the same shared USB data channel. The cdc_mbim driver models these
|
||||
sessions as another set of 802.1q VLAN subdevices of the master wwanY
|
||||
device, mapping MBIM DSS session A to VLAN ID (256 + A) for all values
|
||||
of A.
|
||||
|
||||
The device maximum A is given in the MBIM_DEVICE_SERVICES_INFO
|
||||
structure described in section 10.5.29 of [1].
|
||||
|
||||
The DSS VLAN subdevices are used as a practical interface between the
|
||||
shared MBIM data channel and a MBIM DSS aware userspace application.
|
||||
They are not intended to be presented as-is to an end user. The
|
||||
assumption is that a userspace application initiating a DSS session
|
||||
also takes care of the necessary framing of the DSS data, presenting
|
||||
the stream to the end user in an appropriate way for the stream type.
|
||||
|
||||
The network device ABI requires a dummy ethernet header for every DSS
|
||||
data frame being transported. The contents of this header are
|
||||
arbitrary, with the following exceptions:
|
||||
- TX frames using an IP protocol (0x0800 or 0x86dd) will be dropped
|
||||
- RX frames will have the protocol field set to ETH_P_802_3 (but will
|
||||
not be properly formatted 802.3 frames)
|
||||
- RX frames will have the destination address set to the hardware
|
||||
address of the master device
|
||||
|
||||
The DSS supporting userspace management application is responsible for
|
||||
adding the dummy ethernet header on TX and stripping it on RX.
|
||||
|
||||
This is a simple example using tools commonly available, exporting
|
||||
DssSessionId 5 as a pty character device pointed to by a /dev/nmea
|
||||
symlink:
|
||||
|
||||
ip link add link wwan0 name wwan0.dss5 type vlan id 261
|
||||
ip link set dev wwan0.dss5 up
|
||||
socat INTERFACE:wwan0.dss5,type=2 PTY:,echo=0,link=/dev/nmea
|
||||
|
||||
This is only an example, most suitable for testing out a DSS
|
||||
service. Userspace applications supporting specific MBIM DSS services
|
||||
are expected to use the tools and programming interfaces required by
|
||||
that service.
|
||||
|
||||
Note that adding VLAN links for DSS sessions is entirely optional. A
|
||||
management application may instead choose to bind a packet socket
|
||||
directly to the master network device, using the received VLAN tags to
|
||||
map frames to the correct DSS session and adding 18 byte VLAN ethernet
|
||||
headers with the appropriate tag on TX. In this case using a socket
|
||||
filter is recommended, matching only the DSS VLAN subset. This avoids
|
||||
unnecessary copying of unrelated IP session data to userspace. For
|
||||
example:
|
||||
|
||||
static struct sock_filter dssfilter[] = {
|
||||
/* use special negative offsets to get VLAN tag */
|
||||
BPF_STMT(BPF_LD|BPF_B|BPF_ABS, SKF_AD_OFF + SKF_AD_VLAN_TAG_PRESENT),
|
||||
BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, 1, 0, 6), /* true */
|
||||
|
||||
/* verify DSS VLAN range */
|
||||
BPF_STMT(BPF_LD|BPF_H|BPF_ABS, SKF_AD_OFF + SKF_AD_VLAN_TAG),
|
||||
BPF_JUMP(BPF_JMP|BPF_JGE|BPF_K, 256, 0, 4), /* 256 is first DSS VLAN */
|
||||
BPF_JUMP(BPF_JMP|BPF_JGE|BPF_K, 512, 3, 0), /* 511 is last DSS VLAN */
|
||||
|
||||
/* verify ethertype */
|
||||
BPF_STMT(BPF_LD|BPF_H|BPF_ABS, 2 * ETH_ALEN),
|
||||
BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, ETH_P_802_3, 0, 1),
|
||||
|
||||
BPF_STMT(BPF_RET|BPF_K, (u_int)-1), /* accept */
|
||||
BPF_STMT(BPF_RET|BPF_K, 0), /* ignore */
|
||||
};
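For the TX direction of the packet socket approach, the 18 byte VLAN ethernet header has to be built by the application. The sketch below shows one way to lay it out. The function name dss_build_header() and the constants are hypothetical. Using the ETH_P_802_3 value (0x0001) as the inner protocol is an assumption made here to mirror what the driver sets on RX; the text above only requires that it is not an IP protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DSS_VLAN_HLEN	18	/* 2 x 6 byte MAC + 4 byte VLAN tag + 2 byte proto */
#define DSS_VLAN_BASE	256	/* DSS session A maps to VLAN ID 256 + A */

/*
 * Fill hdr (at least DSS_VLAN_HLEN bytes) with a VLAN tagged dummy
 * ethernet header for DSS session dss_id.
 * Returns the header length, or 0 if dss_id is out of range.
 */
static size_t dss_build_header(uint8_t *hdr, const uint8_t dst[6],
			       const uint8_t src[6], unsigned int dss_id)
{
	uint16_t vid;

	assert(hdr != NULL && dst != NULL && src != NULL);
	if (dss_id > 255)
		return 0;
	vid = (uint16_t)(DSS_VLAN_BASE + dss_id);

	memcpy(hdr, dst, 6);
	memcpy(hdr + 6, src, 6);
	hdr[12] = 0x81;			/* 802.1q TPID 0x8100, network byte order */
	hdr[13] = 0x00;
	hdr[14] = (uint8_t)(vid >> 8);	/* TCI: priority 0, VLAN ID in low 12 bits */
	hdr[15] = (uint8_t)(vid & 0xff);
	hdr[16] = 0x00;			/* inner protocol 0x0001 (assumed choice, */
	hdr[17] = 0x01;			/* must not be 0x0800 or 0x86dd)          */
	return DSS_VLAN_HLEN;
}
```

The DSS payload follows directly after these 18 bytes in the buffer handed to the packet socket.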
|
||||
|
||||
|
||||
|
||||
Tagged IP session 0 VLAN
|
||||
------------------------
|
||||
As described above, MBIM IP session 0 is treated as special by the
|
||||
driver. It is initially mapped to untagged frames on the wwanY
|
||||
network device.
|
||||
|
||||
This mapping implies a few restrictions on multiplexed IPS and DSS
|
||||
sessions, which may not always be practical:
|
||||
- no IPS or DSS session can use a frame size greater than the MTU on
|
||||
IP session 0
|
||||
- no IPS or DSS session can be in the up state unless the network
|
||||
device representing IP session 0 also is up
|
||||
|
||||
These problems can be avoided by optionally making the driver map IP
|
||||
session 0 to a VLAN subdevice, similar to all other IP sessions. This
|
||||
behaviour is triggered by adding a VLAN link for the magic VLAN ID
|
||||
4094. The driver will then immediately start mapping MBIM IP session
|
||||
0 to this VLAN, and will drop untagged frames on the master wwanY
|
||||
device.
|
||||
|
||||
Tip: It might be less confusing to the end user to name this VLAN
|
||||
subdevice after the MBIM SessionID instead of the VLAN ID. For
|
||||
example:
|
||||
|
||||
ip link add link wwan0 name wwan0.0 type vlan id 4094
|
||||
|
||||
|
||||
VLAN mapping
|
||||
------------
|
||||
|
||||
Summarizing the cdc_mbim driver mapping described above, we have this
|
||||
relationship between VLAN tags on the wwanY network device and MBIM
|
||||
sessions on the shared USB data channel:
|
||||
|
||||
VLAN ID MBIM type MBIM SessionID Notes
|
||||
---------------------------------------------------------
|
||||
untagged IPS 0 a)
|
||||
1 - 255 IPS 1 - 255 <VLANID>
|
||||
256 - 511 DSS 0 - 255 <VLANID - 256>
|
||||
512 - 4093 b)
|
||||
4094 IPS 0 c)
|
||||
|
||||
a) if no VLAN ID 4094 link exists, else dropped
|
||||
b) unsupported VLAN range, unconditionally dropped
|
||||
c) if a VLAN ID 4094 link exists, else dropped
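The table can be restated as a small lookup function. This is an illustrative sketch of the mapping only, not code from the cdc_mbim driver. The names mbim_kind, mbim_session and mbim_map_vlan() are hypothetical. VLAN ID 0 is not listed in the table and is treated as dropped here, which is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum mbim_kind {
	MBIM_NONE,	/* frame is dropped */
	MBIM_IPS,	/* IP session */
	MBIM_DSS,	/* Device Service Stream */
};

struct mbim_session {
	enum mbim_kind kind;
	uint8_t id;	/* MBIM SessionID, valid unless kind is MBIM_NONE */
};

#define MBIM_IPS0_VID 4094	/* magic VLAN ID for tagged IP session 0 */

/*
 * Map a VLAN ID on the wwanY device to an MBIM session.
 * vid < 0 means an untagged frame. have_ips0_vlan tells whether a
 * VLAN ID 4094 link exists on the master device.
 */
static struct mbim_session mbim_map_vlan(int vid, bool have_ips0_vlan)
{
	struct mbim_session s = { MBIM_NONE, 0 };

	assert(vid <= 4095);
	if (vid < 0) {
		if (!have_ips0_vlan)		/* note a) */
			s.kind = MBIM_IPS;
	} else if (vid >= 1 && vid <= 255) {
		s.kind = MBIM_IPS;
		s.id = (uint8_t)vid;
	} else if (vid >= 256 && vid <= 511) {
		s.kind = MBIM_DSS;
		s.id = (uint8_t)(vid - 256);
	} else if (vid == MBIM_IPS0_VID && have_ips0_vlan) {
		s.kind = MBIM_IPS;		/* note c), SessionID 0 */
	}
	/* 512 - 4093 and anything else: dropped, note b) */
	return s;
}
```

For example, VLAN ID 261 maps to DSS session 5, which matches the wwan0.dss5 example earlier in this document.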
|
||||
|
||||
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
[1] USB Implementers Forum, Inc. - "Universal Serial Bus
|
||||
Communications Class Subclass Specification for Mobile Broadband
|
||||
Interface Model", Revision 1.0 (Errata 1), May 1, 2013
|
||||
- http://www.usb.org/developers/docs/devclass_docs/
|
||||
|
||||
[2] USB Implementers Forum, Inc. - "Universal Serial Bus
|
||||
Communications Class Subclass Specifications for Network Control
|
||||
Model Devices", Revision 1.0 (Errata 1), November 24, 2010
|
||||
- http://www.usb.org/developers/docs/devclass_docs/
|
||||
|
||||
[3] libmbim - "a glib-based library for talking to WWAN modems and
|
||||
devices which speak the Mobile Interface Broadband Model (MBIM)
|
||||
protocol"
|
||||
- http://www.freedesktop.org/wiki/Software/libmbim/
|
||||
|
||||
[4] ModemManager - "a DBus-activated daemon which controls mobile
|
||||
broadband (2G/3G/4G) devices and connections"
|
||||
- http://www.freedesktop.org/wiki/Software/ModemManager/
|
||||
|
||||
[5] "MBIM (Mobile Broadband Interface Model) Registry"
|
||||
- http://compliance.usb.org/mbim/
|
||||
|
||||
[6] "/proc/bus/usb filesystem output"
|
||||
- Documentation/usb/proc_usb_info.txt
|
||||
|
||||
[7] "/sys/bus/usb/devices/.../descriptors"
|
||||
- Documentation/ABI/stable/sysfs-bus-usb
|
||||
63
Documentation/networking/cops.txt
Normal file
|
|
@ -0,0 +1,63 @@
|
|||
Text File for the COPS LocalTalk Linux driver (cops.c).
|
||||
By Jay Schulist <jschlst@samba.org>
|
||||
|
||||
This driver has two modes and they are: Dayna mode and Tangent mode.
|
||||
Each mode corresponds with the type of card. It has been found
|
||||
that there are 2 main types of cards and all other cards are
|
||||
the same and just have different names or only have minor differences
|
||||
such as more IO ports. As this driver is tested it will
|
||||
become more clear exactly what cards are supported.
|
||||
|
||||
Right now these cards are known to work with the COPS driver. The
|
||||
LT-200 cards work in a somewhat more limited capacity than the
|
||||
DL2000 cards, which work very well and are in use by many people.
|
||||
|
||||
TANGENT driver mode:
|
||||
Tangent ATB-II, Novell NL-1000, Daystar Digital LT-200
|
||||
DAYNA driver mode:
|
||||
Dayna DL2000/DaynaTalk PC (Half Length), COPS LT-95,
|
||||
Farallon PhoneNET PC III, Farallon PhoneNET PC II
|
||||
Other cards possibly supported, mode unknown though:
|
||||
Dayna DL2000 (Full length)
|
||||
|
||||
The COPS driver defaults to using Dayna mode. To change the driver's
|
||||
mode if you built a driver with dual support use board_type=1 or
|
||||
board_type=2 for Dayna or Tangent with insmod.
|
||||
|
||||
** Operation/loading of the driver.
|
||||
Use modprobe like this: /sbin/modprobe cops.o (IO #) (IRQ #)
|
||||
If you do not specify any options the driver will try and use the IO = 0x240,
|
||||
IRQ = 5. As of right now I would only use IRQ 5 for the card, if autoprobing.
|
||||
|
||||
To load multiple COPS driver Localtalk cards you can do one of the following.
|
||||
|
||||
insmod cops io=0x240 irq=5
|
||||
insmod -o cops2 cops io=0x260 irq=3
|
||||
|
||||
Or in lilo.conf put something like this:
|
||||
append="ether=5,0x240,lt0 ether=3,0x260,lt1"
|
||||
|
||||
Then bring up the interface with ifconfig. It will look something like this:
|
||||
lt0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-F7-00-00-00-00-00-00-00-00
|
||||
inet addr:192.168.1.2 Bcast:192.168.1.255 Mask:255.255.255.0
|
||||
UP BROADCAST RUNNING NOARP MULTICAST MTU:600 Metric:1
|
||||
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
|
||||
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 coll:0
|
||||
|
||||
** Netatalk Configuration
|
||||
You will need to configure atalkd with something like the following to make
|
||||
it work with the cops.c driver.
|
||||
|
||||
* For single LTalk card use.
|
||||
dummy -seed -phase 2 -net 2000 -addr 2000.10 -zone "1033"
|
||||
lt0 -seed -phase 1 -net 1000 -addr 1000.50 -zone "1033"
|
||||
|
||||
* For multiple cards, Ethernet and LocalTalk.
|
||||
eth0 -seed -phase 2 -net 3000 -addr 3000.20 -zone "1033"
|
||||
lt0 -seed -phase 1 -net 1000 -addr 1000.50 -zone "1033"
|
||||
|
||||
* For multiple LocalTalk cards, and an Ethernet card.
|
||||
* Order seems to matter here, Ethernet last.
|
||||
lt0 -seed -phase 1 -net 1000 -addr 1000.10 -zone "LocalTalk1"
|
||||
lt1 -seed -phase 1 -net 2000 -addr 2000.20 -zone "LocalTalk2"
|
||||
eth0 -seed -phase 2 -net 3000 -addr 3000.30 -zone "EtherTalk"
|
||||
624
Documentation/networking/cs89x0.txt
Normal file
|
|
@ -0,0 +1,624 @@
|
|||
|
||||
NOTE
|
||||
----
|
||||
|
||||
This document was contributed by Cirrus Logic for kernel 2.2.5. This version
|
||||
has been updated for 2.3.48 by Andrew Morton.
|
||||
|
||||
Cirrus make a copy of this driver available at their website, as
|
||||
described below. In general, you should use the driver version which
|
||||
comes with your Linux distribution.
|
||||
|
||||
|
||||
|
||||
CIRRUS LOGIC LAN CS8900/CS8920 ETHERNET ADAPTERS
|
||||
Linux Network Interface Driver ver. 2.00 <kernel 2.3.48>
|
||||
===============================================================================
|
||||
|
||||
|
||||
TABLE OF CONTENTS
|
||||
|
||||
1.0 CIRRUS LOGIC LAN CS8900/CS8920 ETHERNET ADAPTERS
|
||||
1.1 Product Overview
|
||||
1.2 Driver Description
|
||||
1.2.1 Driver Name
|
||||
1.2.2 Files in the Driver Archive
|
||||
1.3 System Requirements
|
||||
1.4 Licensing Information
|
||||
|
||||
2.0 ADAPTER INSTALLATION and CONFIGURATION
|
||||
2.1 CS8900-based Adapter Configuration
|
||||
2.2 CS8920-based Adapter Configuration
|
||||
|
||||
3.0 LOADING THE DRIVER AS A MODULE
|
||||
|
||||
4.0 COMPILING THE DRIVER
|
||||
4.1 Compiling the Driver as a Loadable Module
|
||||
4.2 Compiling the driver to support memory mode
|
||||
4.3 Compiling the driver to support Rx DMA
|
||||
|
||||
5.0 TESTING AND TROUBLESHOOTING
|
||||
5.1 Known Defects and Limitations
|
||||
5.2 Testing the Adapter
|
||||
5.2.1 Diagnostic Self-Test
|
||||
5.2.2 Diagnostic Network Test
|
||||
5.3 Using the Adapter's LEDs
|
||||
5.4 Resolving I/O Conflicts
|
||||
|
||||
6.0 TECHNICAL SUPPORT
|
||||
6.1 Contacting Cirrus Logic's Technical Support
|
||||
6.2 Information Required Before Contacting Technical Support
|
||||
6.3 Obtaining the Latest Driver Version
|
||||
6.4 Current maintainer
|
||||
6.5 Kernel boot parameters
|
||||
|
||||
|
||||
1.0 CIRRUS LOGIC LAN CS8900/CS8920 ETHERNET ADAPTERS
|
||||
===============================================================================
|
||||
|
||||
|
||||
1.1 PRODUCT OVERVIEW
|
||||
|
||||
The CS8900-based ISA Ethernet Adapters from Cirrus Logic follow
|
||||
IEEE 802.3 standards and support half or full-duplex operation in ISA bus
|
||||
computers on 10 Mbps Ethernet networks. The adapters are designed for operation
|
||||
in 16-bit ISA or EISA bus expansion slots and are available in
|
||||
10BaseT-only or 3-media configurations (10BaseT, 10Base2, and AUI for 10Base-5
|
||||
or fiber networks).
|
||||
|
||||
CS8920-based adapters are similar to the CS8900-based adapter with additional
|
||||
features for Plug and Play (PnP) support and Wakeup Frame recognition. As
|
||||
such, the configuration procedures differ somewhat between the two types of
|
||||
adapters. Refer to the "Adapter Configuration" section for details on
|
||||
configuring both types of adapters.
|
||||
|
||||
|
||||
1.2 DRIVER DESCRIPTION
|
||||
|
||||
The CS8900/CS8920 Ethernet Adapter driver for Linux supports the Linux
|
||||
v2.3.48 or greater kernel. It can be compiled directly into the kernel
|
||||
or loaded at run-time as a device driver module.
|
||||
|
||||
1.2.1 Driver Name: cs89x0
|
||||
|
||||
1.2.2 Files in the Driver Archive:
|
||||
|
||||
The files in the driver at Cirrus' website include:
|
||||
|
||||
readme.txt - this file
|
||||
build - batch file to compile cs89x0.c.
|
||||
cs89x0.c - driver C code
|
||||
cs89x0.h - driver header file
|
||||
cs89x0.o - pre-compiled module (for v2.2.5 kernel)
|
||||
config/Config.in - sample file to include cs89x0 driver in the kernel.
|
||||
config/Makefile - sample file to include cs89x0 driver in the kernel.
|
||||
config/Space.c - sample file to include cs89x0 driver in the kernel.
|
||||
|
||||
|
||||
|
||||
1.3 SYSTEM REQUIREMENTS
|
||||
|
||||
The following hardware is required:
|
||||
|
||||
* Cirrus Logic LAN (CS8900/20-based) Ethernet ISA Adapter
|
||||
|
||||
* IBM or IBM-compatible PC with:
|
||||
* An 80386 or higher processor
|
||||
* 16 bytes of contiguous IO space available between 210h - 370h
|
||||
* One available IRQ (5,10,11,or 12 for the CS8900, 3-7,9-15 for CS8920).
|
||||
|
||||
* Appropriate cable (and connector for AUI, 10BASE-2) for your network
|
||||
topology.
|
||||
|
||||
The following software is required:
|
||||
|
||||
* LINUX kernel version 2.3.48 or higher
|
||||
|
||||
* CS8900/20 Setup Utility (DOS-based)
|
||||
|
||||
* LINUX kernel sources for your kernel (if compiling into kernel)
|
||||
|
||||
* GNU Toolkit (gcc and make) v2.6 or above (if compiling into kernel
|
||||
or a module)
|
||||
|
||||
|
||||
|
||||
1.4 LICENSING INFORMATION
|
||||
|
||||
This program is free software; you can redistribute it and/or modify it under
|
||||
the terms of the GNU General Public License as published by the Free Software
|
||||
Foundation, version 1.
|
||||
|
||||
This program is distributed in the hope that it will be useful, but WITHOUT
|
||||
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
|
||||
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
|
||||
more details.
|
||||
|
||||
For a full copy of the GNU General Public License, write to the Free Software
|
||||
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
|
||||
|
||||
|
||||
|
||||
2.0 ADAPTER INSTALLATION and CONFIGURATION
|
||||
===============================================================================
|
||||
|
||||
Both the CS8900 and CS8920-based adapters can be configured using parameters
|
||||
stored in an on-board EEPROM. You must use the DOS-based CS8900/20 Setup
|
||||
Utility if you want to change the adapter's configuration in EEPROM.
|
||||
|
||||
When loading the driver as a module, you can specify many of the adapter's
|
||||
configuration parameters on the command-line to override the EEPROM's settings
|
||||
or for interface configuration when an EEPROM is not used. (CS8920-based
|
||||
adapters must use an EEPROM.) See Section 3.0 LOADING THE DRIVER AS A MODULE.
|
||||
|
||||
Since the CS8900/20 Setup Utility is a DOS-based application, you must install
|
||||
and configure the adapter in a DOS-based system using the CS8900/20 Setup
|
||||
Utility before installation in the target LINUX system. (Not required if
|
||||
installing a CS8900-based adapter and the default configuration is acceptable.)
|
||||
|
||||
|
||||
2.1 CS8900-BASED ADAPTER CONFIGURATION
|
||||
|
||||
CS8900-based adapters shipped from Cirrus Logic have been configured
|
||||
with the following "default" settings:
|
||||
|
||||
Operation Mode: Memory Mode
|
||||
IRQ: 10
|
||||
Base I/O Address: 300
|
||||
Memory Base Address: D0000
|
||||
Optimization: DOS Client
|
||||
Transmission Mode: Half-duplex
|
||||
BootProm: None
|
||||
Media Type: Autodetect (3-media cards) or
|
||||
10BASE-T (10BASE-T only adapter)
|
||||
|
||||
You should only change the default configuration settings if a conflict with
|
||||
another adapter exists. To change the adapter's configuration, run the
|
||||
CS8900/20 Setup Utility.
|
||||
|
||||
|
||||
2.2 CS8920-BASED ADAPTER CONFIGURATION
|
||||
|
||||
CS8920-based adapters are shipped from Cirrus Logic configured as Plug
|
||||
and Play (PnP) enabled. However, since the cs89x0 driver does NOT
|
||||
support PnP, you must install the CS8920 adapter in a DOS-based PC and
|
||||
run the CS8900/20 Setup Utility to disable PnP and configure the
|
||||
adapter before installation in the target Linux system. Failure to do
|
||||
this will leave the adapter inactive and the driver will be unable to
|
||||
communicate with the adapter.
|
||||
|
||||
|
||||
****************************************************************
|
||||
* CS8920-BASED ADAPTERS: *
|
||||
* *
|
||||
* CS8920-BASED ADAPTERS ARE PLUG and PLAY ENABLED BY DEFAULT. *
|
||||
* THE CS89X0 DRIVER DOES NOT SUPPORT PnP. THEREFORE, YOU MUST *
|
||||
* RUN THE CS8900/20 SETUP UTILITY TO DISABLE PnP SUPPORT AND *
|
||||
* TO ACTIVATE THE ADAPTER. *
|
||||
****************************************************************
|
||||
|
||||
|
||||
|
||||
|
||||
3.0 LOADING THE DRIVER AS A MODULE
|
||||
===============================================================================
|
||||
|
||||
If the driver is compiled as a loadable module, you can load the driver module
|
||||
with the 'modprobe' command. Many of the adapter's configuration parameters can
|
||||
be specified as command-line arguments to the load command. This facility
|
||||
provides a means to override the EEPROM's settings or for interface
|
||||
configuration when an EEPROM is not used.
|
||||
|
||||
Example:
|
||||
|
||||
insmod cs89x0.o io=0x200 irq=0xA media=aui
|
||||
|
||||
This example loads the module and configures the adapter to use an IO port base
|
||||
address of 200h, interrupt 10, and use the AUI media connection. The following
|
||||
configuration options are available on the command line:
|
||||
|
||||
* io=### - specify IO address (200h-360h)
|
||||
* irq=## - specify interrupt level
|
||||
* use_dma=1 - Enable DMA
|
||||
* dma=# - specify dma channel (Driver is compiled to support
|
||||
Rx DMA only)
|
||||
* dmasize=# (16 or 64) - DMA size 16K or 64K. Default value is set to 16.
|
||||
* media=rj45 - specify media type
|
||||
or media=bnc
|
||||
or media=aui
|
||||
or media=auto
|
||||
* duplex=full - specify forced half/full/autonegotiate duplex
|
||||
or duplex=half
|
||||
or duplex=auto
|
||||
* debug=# - debug level (only available if the driver was compiled
|
||||
for debugging)
|
||||
|
||||
NOTES:
|
||||
|
||||
a) If an EEPROM is present, any specified command-line parameter
|
||||
will override the corresponding configuration value stored in
|
||||
EEPROM.
|
||||
|
||||
b) The "io" parameter must be specified on the command-line.
|
||||
|
||||
c) The driver's hardware probe routine is designed to avoid
|
||||
writing to I/O space until it knows that there is a cs89x0
|
||||
card at the written addresses. This could cause problems
|
||||
with device probing. To avoid this behaviour, add one
|
||||
to the `io=' module parameter. This doesn't actually change
|
||||
the I/O address, but it is a flag to tell the driver
|
||||
to partially initialise the hardware before trying to
|
||||
identify the card. This could be dangerous if you are
|
||||
not sure that there is a cs89x0 card at the provided address.
|
||||
|
||||
For example, to scan for an adapter located at IO base 0x300,
|
||||
specify an IO address of 0x301.
|
||||
|
||||
d) The "duplex=auto" parameter is only supported for the CS8920.
|
||||
|
||||
e) The minimum command-line configuration required if an EEPROM is
|
||||
not present is:
|
||||
|
||||
io
|
||||
irq
|
||||
media type (no autodetect)
|
||||
|
||||
f) The following additional parameters are CS89XX defaults (values
|
||||
used with no EEPROM or command-line argument).
|
||||
|
||||
* DMA Burst = enabled
|
||||
* IOCHRDY Enabled = enabled
|
||||
* UseSA = enabled
|
||||
* CS8900 defaults to half-duplex if not specified on command-line
|
||||
* CS8920 defaults to autoneg if not specified on command-line
|
||||
* Use reset defaults for other config parameters
|
||||
* dma_mode = 0
|
||||
|
||||
g) You can use ifconfig to set the adapter's Ethernet address.
|
||||
|
||||
h) Many Linux distributions use the 'modprobe' command to load
|
||||
modules. This program uses the '/etc/conf.modules' file to
|
||||
determine configuration information which is passed to a driver
|
||||
module when it is loaded. All the configuration options which are
|
||||
described above may be placed within /etc/conf.modules.
|
||||
|
||||
For example:
|
||||
|
||||
> cat /etc/conf.modules
|
||||
...
|
||||
alias eth0 cs89x0
|
||||
options cs89x0 io=0x0200 dma=5 use_dma=1
|
||||
...
|
||||
|
||||
In this example we are telling the module system that the
|
||||
ethernet driver for this machine should use the cs89x0 driver. We
|
||||
are asking 'modprobe' to pass the 'io', 'dma' and 'use_dma'
|
||||
arguments to the driver when it is loaded.
|
||||
|
||||
i) Cirrus recommend that the cs89x0 use the ISA DMA channels 5, 6 or
|
||||
7. You will probably find that other DMA channels will not work.
|
||||
|
||||
j) The cs89x0 supports DMA for receiving only. DMA mode is
|
||||
significantly more efficient. Flooding a 400 MHz Celeron machine
|
||||
with large ping packets consumes 82% of its CPU capacity in non-DMA
|
||||
mode. With DMA this is reduced to 45%.
|
||||
|
||||
k) If your Linux kernel was compiled with inbuilt plug-and-play
|
||||
support you will be able to find information about the cs89x0 card
|
||||
with the command
|
||||
|
||||
cat /proc/isapnp
|
||||
|
||||
l) If during DMA operation you find erratic behavior or network data
|
||||
corruption you should use your PC's BIOS to slow the EISA bus clock.
|
||||
|
||||
m) If the cs89x0 driver is compiled directly into the kernel
|
||||
(non-modular) then its I/O address is automatically determined by
|
||||
ISA bus probing. The IRQ number, media options, etc are determined
|
||||
from the card's EEPROM.
|
||||
|
||||
n) If the cs89x0 driver is compiled directly into the kernel, DMA
|
||||
mode may be selected by providing the kernel with a boot option
|
||||
'cs89x0_dma=N' where 'N' is the desired DMA channel number (5, 6 or 7).
|
||||
|
||||
Kernel boot options may be provided on the LILO command line:
|
||||
|
||||
LILO boot: linux cs89x0_dma=5
|
||||
|
||||
or they may be placed in /etc/lilo.conf:
|
||||
|
||||
image=/boot/bzImage-2.3.48
|
||||
append="cs89x0_dma=5"
|
||||
label=linux
|
||||
root=/dev/hda5
|
||||
read-only
|
||||
|
||||
The DMA Rx buffer size is hardwired to 16 kbytes in this mode.
|
||||
(64k mode is not available).
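
As an illustration of notes c) and e) above, the following Python sketch
assembles an "options" line for the module configuration file. The helper name
cs89x0_options and its arguments are hypothetical and not part of the driver;
the sketch only encodes the rules stated in this section (io is mandatory, irq
and an explicit media type are needed when no EEPROM is present, and adding one
to io requests early hardware initialisation).

```python
VALID_MEDIA = ("rj45", "bnc", "aui", "auto")


def cs89x0_options(io, irq=None, media=None, has_eeprom=True, early_init=False):
    """Build an 'options cs89x0 ...' line (hypothetical helper, not driver code)."""
    if io % 2:
        raise ValueError("pass the real (even) I/O base; use early_init for the +1 flag")
    if not 0x200 <= io <= 0x360:
        raise ValueError("io must lie within 200h-360h")
    if media is not None and media not in VALID_MEDIA:
        raise ValueError("unknown media type: {0!r}".format(media))
    if not has_eeprom and (irq is None or media in (None, "auto")):
        # Note e): without an EEPROM, io, irq and a fixed media type are required.
        raise ValueError("without an EEPROM, irq and an explicit media type are required")

    # Note c): io + 1 does not move the card, it only flags early initialisation.
    parts = ["io=0x{0:x}".format(io + 1 if early_init else io)]
    if irq is not None:
        parts.append("irq={0}".format(irq))
    if media is not None:
        parts.append("media={0}".format(media))
    return "options cs89x0 " + " ".join(parts)


if __name__ == "__main__":
    print(cs89x0_options(0x300, irq=10, media="rj45", has_eeprom=False))
```

The early_init flag should only be used when a cs89x0 card is known to sit at
the given address, for the reason given in note c).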
|
||||
|
||||
|
||||
4.0 COMPILING THE DRIVER
|
||||
===============================================================================
|
||||
|
||||
The cs89x0 driver can be compiled directly into the kernel or compiled into
|
||||
a loadable device driver module.
|
||||
|
||||
|
||||
4.1 COMPILING THE DRIVER AS A LOADABLE MODULE
|
||||
|
||||
To compile the driver into a loadable module, use the following command
|
||||
(single command line, without quotes):
|
||||
|
||||
"gcc -D__KERNEL__ -I/usr/src/linux/include -I/usr/src/linux/net/inet -Wall
|
||||
-Wstrict-prototypes -O2 -fomit-frame-pointer -DMODULE -DCONFIG_MODVERSIONS
|
||||
-c cs89x0.c"
|
||||
|
||||
4.2 COMPILING THE DRIVER TO SUPPORT MEMORY MODE
|
||||
|
||||
Support for memory mode was not carried over into the 2.3 series kernels.
|
||||
|
||||
4.3 COMPILING THE DRIVER TO SUPPORT Rx DMA
|
||||
|
||||
The compile-time optionality for DMA was removed in the 2.3 kernel
|
||||
series. DMA support is now unconditionally part of the driver. It is
|
||||
enabled by the 'use_dma=1' module option.
|
||||
|
||||
|
||||
5.0 TESTING AND TROUBLESHOOTING
|
||||
===============================================================================
|
||||
|
||||
5.1 KNOWN DEFECTS and LIMITATIONS
|
||||
|
||||
Refer to the RELEASE.TXT file distributed as part of this archive for a list of
|
||||
known defects, driver limitations, and workarounds.
|
||||
|
||||
|
||||
5.2 TESTING THE ADAPTER
|
||||
|
||||
Once the adapter has been installed and configured, the diagnostic option of
|
||||
the CS8900/20 Setup Utility can be used to test the functionality of the
|
||||
adapter and its network connection. Use the diagnostics 'Self Test' option to
|
||||
test the functionality of the adapter with the hardware configuration you have
|
||||
assigned. You can use the diagnostics 'Network Test' to test the ability of the
|
||||
adapter to communicate across the Ethernet with another PC equipped with a
|
||||
CS8900/20-based adapter card (it must also be running the CS8900/20 Setup
|
||||
Utility).
|
||||
|
||||
NOTE: The Setup Utility's diagnostics are designed to run in a
|
||||
DOS-only operating system environment. DO NOT run the diagnostics
|
||||
from a DOS or command prompt session under Windows 95, Windows NT,
|
||||
OS/2, or other operating system.
|
||||
|
||||
To run the diagnostics tests on the CS8900/20 adapter:
|
||||
|
||||
1.) Boot DOS on the PC and start the CS8900/20 Setup Utility.
|
||||
|
||||
2.) The adapter's current configuration is displayed. Hit the ENTER key to
|
||||
get to the main menu.
|
||||
|
||||
3.) Select 'Diagnostics' (ALT-G) from the main menu.
|
||||
* Select 'Self-Test' to test the adapter's basic functionality.
|
||||
* Select 'Network Test' to test the network connection and cabling.
|
||||
|
||||
|
||||
5.2.1 DIAGNOSTIC SELF-TEST
|
||||
|
||||
The diagnostic self-test checks the adapter's basic functionality as well as
|
||||
its ability to communicate across the ISA bus based on the system resources
|
||||
assigned during hardware configuration. The following tests are performed:
|
||||
|
||||
* IO Register Read/Write Test
|
||||
The IO Register Read/Write test ensures that the CS8900/20 can be
|
||||
accessed in IO mode, and that the IO base address is correct.
|
||||
|
||||
* Shared Memory Test
|
||||
The Shared Memory test ensures the CS8900/20 can be accessed in memory
|
||||
mode and that the range of memory addresses assigned does not conflict
|
||||
with other devices in the system.
|
||||
|
||||
* Interrupt Test
|
||||
The Interrupt test ensures there are no conflicts with the assigned IRQ
|
||||
signal.
|
||||
|
||||
* EEPROM Test
|
||||
The EEPROM test ensures the EEPROM can be read.
|
||||
|
||||
* Chip RAM Test
|
||||
The Chip RAM test ensures the 4K of memory internal to the CS8900/20 is
|
||||
working properly.
|
||||
|
||||
* Internal Loop-back Test
|
||||
The Internal Loop Back test ensures the adapter's transmitter and
|
||||
receiver are operating properly. If this test fails, make sure the
|
||||
adapter's cable is connected to the network (check for LED activity for
|
||||
example).
|
||||
|
||||
* Boot PROM Test
|
||||
The Boot PROM test ensures the Boot PROM is present and can be read.
|
||||
Failure indicates the Boot PROM was not successfully read due to a
|
||||
hardware problem or due to a conflict in the Boot PROM address
|
||||
assignment. (Test only applies if the adapter is configured to use the
|
||||
Boot PROM option.)
|
||||
|
||||
Failure of a test item indicates a possible system resource conflict with
|
||||
another device on the ISA bus. In this case, you should use the Manual Setup
|
||||
option to reconfigure the adapter by selecting a different value for the system
|
||||
resource that failed.
|
||||
|
||||
|
||||
5.2.2 DIAGNOSTIC NETWORK TEST
|
||||
|
||||
The Diagnostic Network Test verifies a working network connection by
|
||||
transferring data between two CS8900/20 adapters installed in different PCs
|
||||
on the same network. (Note: the diagnostic network test should not be run
|
||||
between two nodes across a router.)
|
||||
|
||||
This test requires that each of the two PCs have a CS8900/20-based adapter
|
||||
installed and have the CS8900/20 Setup Utility running. The first PC is
|
||||
configured as a Responder and the other PC is configured as an Initiator.
|
||||
Once the Initiator is started, it sends data frames to the Responder which
|
||||
returns the frames to the Initiator.
|
||||
|
||||
The total number of frames received and transmitted are displayed on the
|
||||
Initiator's display, along with a count of the number of frames received and
|
||||
transmitted OK or in error. The test can be terminated anytime by the user at
|
||||
either PC.
|
||||
|
||||
To set up the Diagnostic Network Test:
|
||||
|
||||
1.) Select a PC with a CS8900/20-based adapter and a known working network
|
||||
connection to act as the Responder. Run the CS8900/20 Setup Utility
|
||||
and select 'Diagnostics -> Network Test -> Responder' from the main
|
||||
menu. Hit ENTER to start the Responder.
|
||||
|
||||
2.) Return to the PC with the CS8900/20-based adapter you want to test and
|
||||
start the CS8900/20 Setup Utility.
|
||||
|
||||
3.) From the main menu, select 'Diagnostics -> Network Test -> Initiator'.
|
||||
Hit ENTER to start the test.
|
||||
|
||||
You may stop the test on the Initiator at any time while allowing the Responder
|
||||
to continue running. In this manner, you can move to additional PCs and test
|
||||
them by starting the Initiator on another PC without having to stop/start the
|
||||
Responder.
|
||||
|
||||
|
||||
|
||||
5.3 USING THE ADAPTER'S LEDs
|
||||
|
||||
The 2 and 3-media adapters have two LEDs visible on the back end of the board
|
||||
located near the 10Base-T connector.
|
||||
|
||||
Link Integrity LED: A "steady" ON of the green LED indicates a valid 10Base-T
|
||||
connection. (Only applies to 10Base-T. The green LED has no significance for
|
||||
a 10Base-2 or AUI connection.)
|
||||
|
||||
TX/RX LED: The yellow LED lights briefly each time the adapter transmits or
|
||||
receives data. (The yellow LED will appear to "flicker" on a typical network.)
|
||||
|
||||
|
||||
5.4 RESOLVING I/O CONFLICTS
|
||||
|
||||
An IO conflict occurs when two or more adapters use the same ISA resource (IO
|
||||
address, memory address or IRQ). You can usually detect an IO conflict in one
|
||||
of four ways after installing and/or configuring the CS8900/20-based adapter:
|
||||
|
||||
1.) The system does not boot properly (or at all).
|
||||
|
||||
2.) The driver cannot communicate with the adapter, reporting an "Adapter
|
||||
not found" error message.
|
||||
|
||||
3.) You cannot connect to the network or the driver will not load.
|
||||
|
||||
4.) If you have configured the adapter to run in memory mode but the driver
|
||||
reports it is using IO mode when loading, this is an indication of a
|
||||
memory address conflict.
|
||||
|
||||
If an IO conflict occurs, run the CS8900/20 Setup Utility and perform a
|
||||
diagnostic self-test. Normally, the ISA resource in conflict will fail the
|
||||
self-test. If so, reconfigure the adapter selecting another choice for the
|
||||
resource in conflict. Run the diagnostics again to check for further IO
|
||||
conflicts.
|
||||
|
||||
In some cases, such as when the PC will not boot, it may be necessary to remove
|
||||
the adapter and reconfigure it by installing it in another PC to run the
|
||||
CS8900/20 Setup Utility. Once reinstalled in the target system, run the
|
||||
diagnostics self-test to ensure the new configuration is free of conflicts
|
||||
before loading the driver again.
|
||||
|
||||
When manually configuring the adapter, keep in mind the typical ISA system
|
||||
resource usage as indicated in the tables below.
|
||||
|
||||
I/O Address     Device
-----------     -------------------------
200-20F         Game I/O adapter
230-23F         Bus Mouse
270-27F         LPT3: third parallel port
2F0-2FF         COM2: second serial port
320-32F         Fixed disk controller

IRQ             Device
---             -----------------------
3               COM2, Bus Mouse
4               COM1
5               LPT2
6               Floppy Disk controller
7               LPT1
8               Real-time Clock
9               EGA/VGA display adapter
12              Mouse (PS/2)
13              Math Coprocessor
14              Hard Disk controller

Memory Address  Device
--------------  ----------------------
A000-BFFF       EGA Graphics Adapter
A000-C7FF       VGA Graphics Adapter
B000-BFFF       Mono Graphics Adapter
B800-BFFF       Color Graphics Adapter
E000-FFFF       AT BIOS
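
The I/O address table can be turned into a quick overlap check before picking
an IO base for the adapter. The Python sketch below is only an illustration:
the function name io_conflicts is hypothetical, the ranges are copied from the
table above, and the 16-byte window comes from the system requirements in
section 1.3. It does not probe hardware; it only compares address ranges.

```python
# Typical ISA I/O usage, copied from the table above (inclusive ranges).
TYPICAL_IO_USAGE = [
    (0x200, 0x20F, "Game I/O adapter"),
    (0x230, 0x23F, "Bus Mouse"),
    (0x270, 0x27F, "LPT3: third parallel port"),
    (0x2F0, 0x2FF, "COM2: second serial port"),
    (0x320, 0x32F, "Fixed disk controller"),
]

# The adapter needs 16 bytes of contiguous I/O space (section 1.3).
CS89X0_IO_SPAN = 16


def io_conflicts(base, span=CS89X0_IO_SPAN, usage=TYPICAL_IO_USAGE):
    """Return the names of typical devices whose I/O range overlaps the window."""
    last = base + span - 1
    return [name for lo, hi, name in usage if base <= hi and lo <= last]


if __name__ == "__main__":
    for candidate in (0x200, 0x300, 0x320):
        print(hex(candidate), io_conflicts(candidate) or "no typical conflict")
```

An empty result only means the window avoids the typical assignments listed
here; the Setup Utility's self-test remains the way to confirm the choice on a
given machine.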
|
||||
|
||||
|
||||
|
||||
|
||||
6.0 TECHNICAL SUPPORT
|
||||
===============================================================================
|
||||
|
||||
6.1 CONTACTING CIRRUS LOGIC'S TECHNICAL SUPPORT
|
||||
|
||||
Cirrus Logic's CS89XX Technical Application Support can be reached at:
|
||||
|
||||
Telephone :(800) 888-5016 (from inside U.S. and Canada)
|
||||
:(512) 442-7555 (from outside the U.S. and Canada)
|
||||
Fax :(512) 912-3871
|
||||
Email :ethernet@crystal.cirrus.com
|
||||
WWW :http://www.cirrus.com
|
||||
|
||||
|
||||
6.2 INFORMATION REQUIRED BEFORE CONTACTING TECHNICAL SUPPORT
|
||||
|
||||
Before contacting Cirrus Logic for technical support, be prepared to provide as
|
||||
much of the following information as possible.
|
||||
|
||||
1.) Adapter type (CRD8900, CDB8900, CDB8920, etc.)
|
||||
|
||||
2.) Adapter configuration
|
||||
|
||||
* IO Base, Memory Base, IO or memory mode enabled, IRQ, DMA channel
|
||||
* Plug and Play enabled/disabled (CS8920-based adapters only)
|
||||
* Configured for media auto-detect or specific media type (which type).
|
||||
|
||||
3.) PC System's Configuration
|
||||
|
||||
* Plug and Play system (yes/no)
|
||||
* BIOS (make and version)
|
||||
* System make and model
|
||||
* CPU (type and speed)
|
||||
* System RAM
|
||||
* SCSI Adapter
|
||||
|
||||
4.) Software
|
||||
|
||||
* CS89XX driver and version
|
||||
* Your network operating system and version
|
||||
* Your system's OS version
|
||||
* Version of all protocol support files
|
||||
|
||||
5.) Any Error Message displayed.
|
||||
|
||||
|
||||
|
||||
6.3 OBTAINING THE LATEST DRIVER VERSION
|
||||
|
||||
You can obtain the latest CS89XX drivers and support software from Cirrus Logic's
|
||||
Web site. You can also contact Cirrus Logic's Technical Support (email:
|
||||
ethernet@crystal.cirrus.com) and request that you be registered for automatic
|
||||
software-update notification.
|
||||
|
||||
Cirrus Logic maintains a web page at http://www.cirrus.com with the
|
||||
latest drivers and technical publications.
|
||||
|
||||
|
||||
6.4 Current maintainer
|
||||
|
||||
In February 2000 the maintenance of this driver was assumed by Andrew
|
||||
Morton.
|
||||
|
||||
6.5 Kernel boot parameters
|
||||
|
||||
For use in embedded environments with no cs89x0 EEPROM, the kernel boot
|
||||
parameter `cs89x0_media=' has been implemented. Usage is:
|
||||
|
||||
cs89x0_media=rj45 or
|
||||
cs89x0_media=aui or
|
||||
cs89x0_media=bnc
|
||||
|
||||
48
Documentation/networking/cxacru-cf.py
Normal file
|
|
@ -0,0 +1,48 @@
|
|||
#!/usr/bin/env python
|
||||
# Copyright 2009 Simon Arlott
|
||||
#
|
||||
# This program is free software; you can redistribute it and/or modify it
|
||||
# under the terms of the GNU General Public License as published by the Free
|
||||
# Software Foundation; either version 2 of the License, or (at your option)
|
||||
# any later version.
|
||||
#
|
||||
# This program is distributed in the hope that it will be useful, but WITHOUT
|
||||
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
|
||||
# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
|
||||
# more details.
|
||||
#
|
||||
# You should have received a copy of the GNU General Public License along with
|
||||
# this program; if not, write to the Free Software Foundation, Inc., 59
|
||||
# Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||||
#
|
||||
# Usage: cxacru-cf.py < cxacru-cf.bin
|
||||
# Output: values string suitable for the sysfs adsl_config attribute
|
||||
#
|
||||
# Warning: cxacru-cf.bin with MD5 hash cdbac2689969d5ed5d4850f117702110
|
||||
# contains mis-aligned values which will stop the modem from being able
|
||||
# to make a connection. If the first and last two bytes are removed then
|
||||
# the values become valid, but the modulation will be forced to ANSI
|
||||
# T1.413 only which may not be appropriate.
|
||||
#
|
||||
# The original binary format is a packed list of le32 values.
|
||||
|
||||
import sys
|
||||
import struct
|
||||
|
||||
# Read raw bytes on both Python 2 and Python 3 (sys.stdin is text-mode on 3).
stdin = getattr(sys.stdin, "buffer", sys.stdin)

i = 0
while True:
    buf = stdin.read(4)

    if len(buf) == 0:
        # Clean end of input.
        break
    elif len(buf) != 4:
        # Trailing partial value: the file is not a whole number of le32 words.
        sys.stdout.write("\n")
        sys.stderr.write("Error: read {0} not 4 bytes\n".format(len(buf)))
        sys.exit(1)

    if i > 0:
        sys.stdout.write(" ")
    # Emit <index>=<value>, with the index in hexadecimal.
    sys.stdout.write("{0:x}={1}".format(i, struct.unpack("<I", buf)[0]))
    i += 1

sys.stdout.write("\n")
|
||||
100
Documentation/networking/cxacru.txt
Normal file
|
|
@ -0,0 +1,100 @@
|
|||
Firmware is required for this device: http://accessrunner.sourceforge.net/
|
||||
|
||||
While it is capable of managing/maintaining the ADSL connection without the
|
||||
module loaded, the device will sometimes stop responding after unloading the
|
||||
driver and it is necessary to unplug/remove power to the device to fix this.
|
||||
|
||||
Note: support for cxacru-cf.bin has been removed. It was not loaded correctly
|
||||
so it had no effect on the device configuration. Fixing it could have stopped
|
||||
existing devices working when an invalid configuration is supplied.
|
||||
|
||||
There is a script cxacru-cf.py to convert an existing file to the sysfs form.
|
||||
|
||||
Detected devices will appear as ATM devices named "cxacru". In /sys/class/atm/
|
||||
these are directories named cxacruN where N is the device number. A symlink
|
||||
named device points to the USB interface device's directory which contains
|
||||
several sysfs attribute files for retrieving device statistics:
|
||||
|
||||
* adsl_controller_version
|
||||
|
||||
* adsl_headend
|
||||
* adsl_headend_environment
|
||||
Information about the remote headend.
|
||||
|
||||
* adsl_config
|
||||
Configuration writing interface.
|
||||
Write parameters in hexadecimal format <index>=<value>,
|
||||
separated by whitespace, e.g.:
|
||||
"1=0 a=5"
|
||||
Up to 7 parameters at a time will be sent and the modem will restart
|
||||
the ADSL connection when any value is set. These are logged for future
|
||||
reference.
|
||||
|
||||
* downstream_attenuation (dB)
|
||||
* downstream_bits_per_frame
|
||||
* downstream_rate (kbps)
|
||||
* downstream_snr_margin (dB)
|
||||
Downstream stats.
|
||||
|
||||
* upstream_attenuation (dB)
|
||||
* upstream_bits_per_frame
|
||||
* upstream_rate (kbps)
|
||||
* upstream_snr_margin (dB)
|
||||
* transmitter_power (dBm/Hz)
|
||||
Upstream stats.
|
||||
|
||||
* downstream_crc_errors
|
||||
* downstream_fec_errors
|
||||
* downstream_hec_errors
|
||||
* upstream_crc_errors
|
||||
* upstream_fec_errors
|
||||
* upstream_hec_errors
|
||||
Error counts.
|
||||
|
||||
* line_startable
|
||||
Indicates that ADSL support on the device
|
||||
is/can be enabled; see adsl_state.
|
||||
|
||||
* line_status
|
||||
"initialising"
|
||||
"down"
|
||||
"attempting to activate"
|
||||
"training"
|
||||
"channel analysis"
|
||||
"exchange"
|
||||
"waiting"
|
||||
"up"
|
||||
|
||||
Changes between "down" and "attempting to activate"
|
||||
if there is no signal.
|
||||
|
||||
* link_status
|
||||
"not connected"
|
||||
"connected"
|
||||
"lost"
|
||||
|
||||
* mac_address
|
||||
|
||||
* modulation
|
||||
"" (when not connected)
|
||||
"ANSI T1.413"
|
||||
"ITU-T G.992.1 (G.DMT)"
|
||||
"ITU-T G.992.2 (G.LITE)"
|
||||
|
||||
* startup_attempts
|
||||
Count of total attempts to initialise ADSL.
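
The adsl_config description above says that parameters are written as
whitespace-separated <index>=<value> pairs and that up to 7 are sent at a time.
The Python sketch below shows one way to check such a string and see how it
would fall into groups of 7. It is an illustration only: batch_adsl_config is a
hypothetical name, the pattern accepts hexadecimal digits on both sides of "="
as an assumption, and the driver's own parsing is not reproduced here.

```python
import re

# Assumed token shape: hexadecimal digits, "=", hexadecimal digits.
PARAM_RE = re.compile(r"^[0-9a-fA-F]+=[0-9a-fA-F]+$")

# "Up to 7 parameters at a time will be sent"
MAX_PER_REQUEST = 7


def batch_adsl_config(values):
    """Split a values string into lists of at most MAX_PER_REQUEST tokens."""
    tokens = values.split()
    for token in tokens:
        if not PARAM_RE.match(token):
            raise ValueError("not in <index>=<value> form: {0!r}".format(token))
    return [tokens[i:i + MAX_PER_REQUEST]
            for i in range(0, len(tokens), MAX_PER_REQUEST)]


if __name__ == "__main__":
    print(batch_adsl_config("1=0 a=5"))
```

Since the modem restarts the ADSL connection when any value is set, it may be
worth validating a long string this way before writing it to the attribute.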
|
||||
|
||||
To enable/disable ADSL, the following can be written to the adsl_state file:
    "start"
    "stop"
    "restart" (stops, waits 1.5s, then starts)
    "poll" (used to resume status polling if it was disabled due to failure)
|
||||
|
||||
Changes in adsl/line state are reported via kernel log messages:
|
||||
[4942145.150704] ATM dev 0: ADSL state: running
|
||||
[4942243.663766] ATM dev 0: ADSL line: down
|
||||
[4942249.665075] ATM dev 0: ADSL line: attempting to activate
|
||||
[4942253.654954] ATM dev 0: ADSL line: training
|
||||
[4942255.666387] ATM dev 0: ADSL line: channel analysis
|
||||
[4942259.656262] ATM dev 0: ADSL line: exchange
|
||||
[2635357.696901] ATM dev 0: ADSL line: up (8128 kb/s down | 832 kb/s up)
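
The "ADSL line" messages above have a regular shape, so a monitoring script
can pick out the line status and, once the line is up, the reported rates. The
Python sketch below is a minimal example written against the sample messages
shown here only; parse_adsl_line is a hypothetical helper and the pattern may
need adjusting if the message format differs on a given kernel.

```python
import re

# Matches e.g. "ATM dev 0: ADSL line: up (8128 kb/s down | 832 kb/s up)".
LINE_RE = re.compile(
    r"ATM dev (?P<dev>\d+): ADSL line: (?P<status>[a-z ]+?)"
    r"(?: \((?P<down>\d+) kb/s down \| (?P<up>\d+) kb/s up\))?$"
)


def parse_adsl_line(message):
    """Return a dict describing an 'ADSL line' message, or None if no match."""
    match = LINE_RE.search(message.strip())
    if match is None:
        return None
    info = {"dev": int(match.group("dev")), "status": match.group("status")}
    if match.group("down") is not None:
        # Rates are only present in the "up" message.
        info["down_kbps"] = int(match.group("down"))
        info["up_kbps"] = int(match.group("up"))
    return info


if __name__ == "__main__":
    print(parse_adsl_line("ATM dev 0: ADSL line: up (8128 kb/s down | 832 kb/s up)"))
```

Messages of the "ADSL state" kind are deliberately not matched and yield None.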
|
||||
352
Documentation/networking/cxgb.txt
Normal file
|
|
@ -0,0 +1,352 @@
|
|||
Chelsio N210 10Gb Ethernet Network Controller
|
||||
|
||||
Driver Release Notes for Linux
|
||||
|
||||
Version 2.1.1
|
||||
|
||||
June 20, 2005
|
||||
|
||||
CONTENTS
|
||||
========
|
||||
INTRODUCTION
|
||||
FEATURES
|
||||
PERFORMANCE
|
||||
DRIVER MESSAGES
|
||||
KNOWN ISSUES
|
||||
SUPPORT
|
||||
|
||||
|
||||
INTRODUCTION
|
||||
============
|
||||
|
||||
This document describes the Linux driver for Chelsio 10Gb Ethernet Network
|
||||
Controller. This driver supports the Chelsio N210 NIC and is backward
|
||||
compatible with the Chelsio N110 model 10Gb NICs.
|
||||
|
||||
|
||||
FEATURES
|
||||
========
|
||||
|
||||
Adaptive Interrupts (adaptive-rx)
|
||||
---------------------------------
|
||||
|
||||
This feature provides an adaptive algorithm that adjusts the interrupt
|
||||
coalescing parameters, allowing the driver to dynamically adapt the latency
|
||||
settings to achieve the highest performance during various types of network
|
||||
load.
|
||||
|
||||
The interface used to control this feature is ethtool. Please see the
|
||||
ethtool manpage for additional usage information.
|
||||
|
||||
By default, adaptive-rx is disabled.
|
||||
To enable adaptive-rx:
|
||||
|
||||
ethtool -C <interface> adaptive-rx on
|
||||
|
||||
To disable adaptive-rx, use ethtool:
|
||||
|
||||
ethtool -C <interface> adaptive-rx off
|
||||
|
||||
After disabling adaptive-rx, the timer latency value will be set to 50us.
|
||||
You may set the timer latency after disabling adaptive-rx:
|
||||
|
||||
ethtool -C <interface> rx-usecs <microseconds>
|
||||
|
||||
An example to set the timer latency value to 100us on eth0:
|
||||
|
||||
ethtool -C eth0 rx-usecs 100
|
||||
|
||||
You may also provide a timer latency value while disabling adaptive-rx:
|
||||
|
||||
ethtool -C <interface> adaptive-rx off rx-usecs <microseconds>
|
||||
|
||||
If adaptive-rx is disabled and a timer latency value is specified, the timer
|
||||
will be set to the specified value until changed by the user or until
|
||||
adaptive-rx is enabled.
|
||||
|
||||
To view the status of the adaptive-rx and timer latency values:
|
||||
|
||||
ethtool -c <interface>
|
||||
|
||||
|
||||
TCP Segmentation Offloading (TSO) Support
|
||||
-----------------------------------------
|
||||
|
||||
This feature, also known as "large send", enables a system's protocol stack
|
||||
to offload portions of outbound TCP processing to a network interface card
|
||||
thereby reducing system CPU utilization and enhancing performance.
|
||||
|
||||
The interface used to control this feature is ethtool version 1.8 or higher.
|
||||
Please see the ethtool manpage for additional usage information.
|
||||
|
||||
By default, TSO is enabled.
|
||||
To disable TSO:
|
||||
|
||||
ethtool -K <interface> tso off
|
||||
|
||||
To enable TSO:
|
||||
|
||||
ethtool -K <interface> tso on
|
||||
|
||||
To view the status of TSO:
|
||||
|
||||
ethtool -k <interface>
|
||||
|
||||
|
||||
PERFORMANCE
|
||||
===========
|
||||
|
||||
The following information is provided as an example of how to change system
|
||||
parameters for "performance tuning" and what values to use. You may or may not
|
||||
want to change these system parameters, depending on your server/workstation
|
||||
application. Doing so is not warranted in any way by Chelsio Communications,
|
||||
and is done at "YOUR OWN RISK". Chelsio will not be held responsible for loss
|
||||
of data or damage to equipment.
|
||||
|
||||
Your distribution may have a different way of doing things, or you may prefer
|
||||
a different method. These commands are shown only to provide an example of
|
||||
what to do and are by no means definitive.
|
||||
|
||||
Making any of the following system changes will only last until you reboot
|
||||
your system. You may want to write a script that runs at boot-up which
|
||||
includes the optimal settings for your system.
|
||||
|
||||
Setting PCI Latency Timer:
|
||||
setpci -d 1425:* 0x0c.l=0x0000F800
|
||||
|
||||
Disabling TCP timestamp:
|
||||
sysctl -w net.ipv4.tcp_timestamps=0
|
||||
|
||||
Disabling SACK:
|
||||
sysctl -w net.ipv4.tcp_sack=0
|
||||
|
||||
Setting large number of incoming connection requests:
|
||||
sysctl -w net.ipv4.tcp_max_syn_backlog=3000
|
||||
|
||||
Setting maximum receive socket buffer size:
|
||||
sysctl -w net.core.rmem_max=1024000
|
||||
|
||||
Setting maximum send socket buffer size:
|
||||
sysctl -w net.core.wmem_max=1024000
|
||||
|
||||
Set smp_affinity (on a multiprocessor system) to a single CPU:
|
||||
echo 1 > /proc/irq/<interrupt_number>/smp_affinity
|
||||
|
||||
Setting default receive socket buffer size:
|
||||
sysctl -w net.core.rmem_default=524287
|
||||
|
||||
Setting default send socket buffer size:
|
||||
sysctl -w net.core.wmem_default=524287
|
||||
|
||||
Setting maximum option memory buffers:
|
||||
sysctl -w net.core.optmem_max=524287
|
||||
|
||||
Setting maximum backlog (# of unprocessed packets before kernel drops):
|
||||
sysctl -w net.core.netdev_max_backlog=300000
|
||||
|
||||
Setting TCP read buffers (min/default/max):
|
||||
sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"
|
||||
|
||||
Setting TCP write buffers (min/default/max):
|
||||
sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"
|
||||
|
||||
Setting TCP buffer space (min/pressure/max):
|
||||
sysctl -w net.ipv4.tcp_mem="10000000 10000000 10000000"
|
||||
|
||||
TCP window size for single connections:
|
||||
The receive buffer (RX_WINDOW) size must be at least as large as the
|
||||
Bandwidth-Delay Product of the communication link between the sender and
|
||||
receiver. Due to the variations of RTT, you may want to increase the buffer
|
||||
size up to 2 times the Bandwidth-Delay Product. Reference page 289 of
|
||||
"TCP/IP Illustrated, Volume 1, The Protocols" by W. Richard Stevens.
|
||||
At 10Gb speeds, use the following formula:
|
||||
RX_WINDOW >= 1.25MBytes * RTT(in milliseconds)
|
||||
Example for RTT with 100us: RX_WINDOW = (1,250,000 * 0.1) = 125,000
|
||||
RX_WINDOW sizes of 256KB - 512KB should be sufficient.
|
||||
Setting the min, default, and max receive buffer (RX_WINDOW) size:
|
||||
sysctl -w net.ipv4.tcp_rmem="<min> <default> <max>"
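The Bandwidth-Delay Product formula above can be worked through in the shell. This is a sketch only; the function name rx_window_bytes is an illustrative assumption. It takes the RTT in microseconds and, optionally, the link speed in Mbit/s (10000 by default, i.e. 10Gb).

```shell
# rx_window_bytes RTT_US [LINK_MBPS]
# Bandwidth-Delay Product in bytes: (Mbit/s * microseconds) / 8.
# The 10^6 factors of Mbit and microseconds cancel out.
rx_window_bytes() {
    rtt_us=$1
    link_mbps=${2:-10000}
    echo $(( link_mbps * rtt_us / 8 ))
}

rx_window_bytes 100    # prints 125000, matching the example above
```

Doubling the result (250000 here) gives the upper end of the range suggested above to allow for RTT variation.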
|
||||
|
||||
TCP window size for multiple connections:
|
||||
The receive buffer (RX_WINDOW) size may be calculated the same as single
|
||||
connections, but should be divided by the number of connections. The
|
||||
smaller window prevents congestion and facilitates better pacing,
|
||||
especially if/when MAC level flow control does not work well or when it is
|
||||
not supported on the machine. Experimentation may be necessary to attain
|
||||
the correct value. This method is provided as a starting point for the
|
||||
correct receive buffer size.
|
||||
Setting the min, default, and max receive buffer (RX_WINDOW) size is
|
||||
performed in the same manner as for a single connection.
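As a starting point for multiple connections, the single-connection window can simply be divided by the connection count. The sketch below assumes a 10Gb link (1250 bytes per microsecond of RTT); the function name is hypothetical and the result still needs experimentation as described above.

```shell
# per_conn_rx_window RTT_US CONNECTIONS
# 10Gb Bandwidth-Delay Product split evenly across the connections.
per_conn_rx_window() {
    rtt_us=$1
    conns=$2
    echo $(( 1250 * rtt_us / conns ))
}

per_conn_rx_window 100 4    # prints 31250
```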
|
||||
|
||||
|
||||
DRIVER MESSAGES
|
||||
===============
|
||||
|
||||
The following messages are the most common messages logged by syslog. These
|
||||
may be found in /var/log/messages.
|
||||
|
||||
Driver up:
|
||||
Chelsio Network Driver - version 2.1.1
|
||||
|
||||
NIC detected:
|
||||
eth#: Chelsio N210 1x10GBaseX NIC (rev #), PCIX 133MHz/64-bit
|
||||
|
||||
Link up:
|
||||
eth#: link is up at 10 Gbps, full duplex
|
||||
|
||||
Link down:
|
||||
eth#: link is down
|
||||
|
||||
|
||||
KNOWN ISSUES
|
||||
============
|
||||
|
||||
These issues have been identified during testing. The following information
|
||||
is provided as a workaround for each problem. In some cases, the problem is
|
||||
inherent to Linux or to a particular Linux Distribution and/or hardware
|
||||
platform.
|
||||
|
||||
1. Large number of TCP retransmits on a multiprocessor (SMP) system.
|
||||
|
||||
On a system with multiple CPUs, the interrupt (IRQ) for the network
|
||||
controller may be bound to more than one CPU. This will cause TCP
|
||||
retransmits if the packet data is split across different CPUs
|
||||
and re-assembled in a different order than expected.
|
||||
|
||||
To eliminate the TCP retransmits, set smp_affinity on the particular
|
||||
interrupt to a single CPU. You can locate the interrupt (IRQ) used on
|
||||
the N110/N210 by using ifconfig:
|
||||
ifconfig <dev_name> | grep Interrupt
|
||||
Set the smp_affinity to a single CPU:
|
||||
echo 1 > /proc/irq/<interrupt_number>/smp_affinity
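The value written to smp_affinity is a hexadecimal bitmask in which bit N selects CPU N, so "1" above means CPU 0. A small sketch for deriving the mask for another CPU follows; the function name is an illustrative assumption.

```shell
# cpu_to_affinity_mask CPU_NUMBER
# Prints the hexadecimal smp_affinity mask that selects only that CPU.
cpu_to_affinity_mask() {
    printf '%x\n' $(( 1 << $1 ))
}

cpu_to_affinity_mask 0    # prints 1
cpu_to_affinity_mask 3    # prints 8
```

The output can then be written to /proc/irq/<interrupt_number>/smp_affinity in place of the literal 1.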
|
||||
|
||||
It is highly suggested that you do not run the irqbalance daemon on your
|
||||
system, as this will change any smp_affinity setting you have applied.
|
||||
The irqbalance daemon runs on a 10 second interval and binds interrupts
|
||||
to the least loaded CPU determined by the daemon. To disable this daemon:
|
||||
chkconfig --level 2345 irqbalance off
|
||||
|
||||
By default, some Linux distributions enable the kernel feature,
|
||||
irqbalance, which performs the same function as the daemon. To disable
|
||||
this feature, add the following line to your bootloader:
|
||||
noirqbalance
|
||||
|
||||
Example using the Grub bootloader:
|
||||
title Red Hat Enterprise Linux AS (2.4.21-27.ELsmp)
|
||||
root (hd0,0)
|
||||
kernel /vmlinuz-2.4.21-27.ELsmp ro root=/dev/hda3 noirqbalance
|
||||
initrd /initrd-2.4.21-27.ELsmp.img
|
||||
|
||||
2. After running insmod, the driver is loaded and the incorrect network
|
||||
interface is brought up without running ifup.
|
||||
|
||||
When using 2.4.x kernels, including RHEL kernels, the Linux kernel
|
||||
invokes a script named "hotplug". This script is primarily used to
|
||||
automatically bring up USB devices when they are plugged in; however,
|
||||
the script also attempts to automatically bring up a network interface
|
||||
after loading the kernel module. The hotplug script does this by scanning
|
||||
the ifcfg-eth# config files in /etc/sysconfig/network-scripts, looking
|
||||
for HWADDR=<mac_address>.
|
||||
|
||||
If the hotplug script does not find the HWADDR within any of the
|
||||
ifcfg-eth# files, it will bring up the device with the next available
|
||||
interface name. If this interface is already configured for a different
|
||||
network card, your new interface will have incorrect IP address and
|
||||
network settings.
|
||||
|
||||
To solve this issue, you can add the HWADDR=<mac_address> key to the
|
||||
interface config file of your network controller.
|
||||
|
||||
To disable this "hotplug" feature, you may add the driver (module name)
|
||||
to the "blacklist" file located in /etc/hotplug. It has been noted that
|
||||
this does not work for network devices because the net.agent script
|
||||
does not use the blacklist file. Simply remove, or rename, the net.agent
|
||||
script located in /etc/hotplug to disable this feature.
|
||||
|
||||
3. Transport Protocol (TP) hangs when running heavy multi-connection traffic
|
||||
on an AMD Opteron system with HyperTransport PCI-X Tunnel chipset.
|
||||
|
||||
If your AMD Opteron system uses the AMD-8131 HyperTransport PCI-X Tunnel
|
||||
chipset, you may experience the "133-MHz Mode Split Completion Data
|
||||
Corruption" bug identified by AMD while using a 133MHz PCI-X card on the
|
||||
PCI-X bus.
|
||||
|
||||
AMD states, "Under highly specific conditions, the AMD-8131 PCI-X Tunnel
|
||||
can provide stale data via split completion cycles to a PCI-X card that
|
||||
is operating at 133 Mhz", causing data corruption.
|
||||
|
||||
AMD provides three workarounds for this problem; however, Chelsio
|
||||
recommends the first option for best performance with this bug:
|
||||
|
||||
For 133MHz secondary bus operation, limit the transaction length and
|
||||
the number of outstanding transactions, via BIOS configuration
|
||||
programming of the PCI-X card, to the following:
|
||||
|
||||
Data Length (bytes): 1k
|
||||
Total allowed outstanding transactions: 2
|
||||
|
||||
Please refer to AMD 8131-HT/PCI-X Errata 26310 Rev 3.08 August 2004,
|
||||
section 56, "133-MHz Mode Split Completion Data Corruption" for more
|
||||
details with this bug and workarounds suggested by AMD.
|
||||
|
||||
It may be possible to work outside AMD's recommended PCI-X settings; try
|
||||
increasing the Data Length to 2k bytes for increased performance. If you
|
||||
have issues with these settings, please revert to the "safe" settings
|
||||
and duplicate the problem before submitting a bug or asking for support.
|
||||
|
||||
NOTE: The default setting on most systems is 8 outstanding transactions
|
||||
and 2k bytes data length.
|
||||
|
||||
4. On multiprocessor systems, it has been noted that an application which
|
||||
is handling 10Gb networking can switch between CPUs causing degraded
|
||||
and/or unstable performance.
|
||||
|
||||
If running on an SMP system and taking performance measurements, it
|
||||
is suggested you either run the latest netperf-2.4.0+ or use a binding
|
||||
tool such as Tim Hockin's procstate utilities (runon)
|
||||
<http://www.hockin.org/~thockin/procstate/>.
|
||||
|
||||
Binding netserver and netperf (or other applications) to particular
|
||||
CPUs can make a significant difference in performance measurements.
|
||||
You may need to experiment to find which CPU to bind the application to in
|
||||
order to achieve the best performance for your system.
|
||||
|
||||
If you are developing an application designed for 10Gb networking,
|
||||
please keep in mind you may want to look at the system calls
|
||||
sched_setaffinity & sched_getaffinity to bind your application.
|
||||
|
||||
If you are just running user-space applications such as ftp, telnet,
|
||||
etc., you may want to try the runon tool provided by Tim Hockin's
|
||||
procstate utility. You could also try binding the interface to a
|
||||
particular CPU: runon 0 ifup eth0
|
||||
|
||||
|
||||
SUPPORT
|
||||
=======
|
||||
|
||||
If you have problems with the software or hardware, please contact our
|
||||
customer support team via email at support@chelsio.com or check our website
|
||||
at http://www.chelsio.com
|
||||
|
||||
===============================================================================
|
||||
|
||||
Chelsio Communications
|
||||
370 San Aleso Ave.
|
||||
Suite 100
|
||||
Sunnyvale, CA 94085
|
||||
http://www.chelsio.com
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License, version 2, as
|
||||
published by the Free Software Foundation.
|
||||
|
||||
You should have received a copy of the GNU General Public License along
|
||||
with this program; if not, write to the Free Software Foundation, Inc.,
|
||||
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
|
||||
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
|
||||
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
|
||||
|
||||
Copyright (c) 2003-2005 Chelsio Communications. All rights reserved.
|
||||
|
||||
===============================================================================
|
||||
207
Documentation/networking/dccp.txt
Normal file
|
|
@ -0,0 +1,207 @@
|
|||
DCCP protocol
|
||||
=============
|
||||
|
||||
|
||||
Contents
|
||||
========
|
||||
- Introduction
|
||||
- Missing features
|
||||
- Socket options
|
||||
- Sysctl variables
|
||||
- IOCTLs
|
||||
- Other tunables
|
||||
- Notes
|
||||
|
||||
|
||||
Introduction
|
||||
============
|
||||
Datagram Congestion Control Protocol (DCCP) is an unreliable, connection
|
||||
oriented protocol designed to solve issues present in UDP and TCP, particularly
|
||||
for real-time and multimedia (streaming) traffic.
|
||||
It divides into a base protocol (RFC 4340) and pluggable congestion control
|
||||
modules called CCIDs. Like pluggable TCP congestion control, at least one CCID
|
||||
needs to be enabled in order for the protocol to function properly. In the Linux
|
||||
implementation, this is the TCP-like CCID2 (RFC 4341). Additional CCIDs, such as
|
||||
the TCP-friendly CCID3 (RFC 4342), are optional.
|
||||
For a brief introduction to CCIDs and suggestions for choosing a CCID to match
|
||||
given applications, see section 10 of RFC 4340.
|
||||
|
||||
It has a base protocol and pluggable congestion control IDs (CCIDs).
|
||||
|
||||
DCCP is a Proposed Standard (RFC 2026), and the homepage for DCCP as a protocol
|
||||
is at http://www.ietf.org/html.charters/dccp-charter.html
|
||||
|
||||
|
||||
Missing features
|
||||
================
|
||||
The Linux DCCP implementation does not currently support all the features that are
|
||||
specified in RFCs 4340 to 4342.
|
||||
|
||||
The known bugs are at:
|
||||
http://www.linuxfoundation.org/collaborate/workgroups/networking/todo#DCCP
|
||||
|
||||
For more up-to-date versions of the DCCP implementation, please consider using
|
||||
the experimental DCCP test tree; instructions for checking this out are on:
|
||||
http://www.linuxfoundation.org/collaborate/workgroups/networking/dccp_testing#Experimental_DCCP_source_tree
|
||||
|
||||
|
||||
Socket options
|
||||
==============
|
||||
DCCP_SOCKOPT_QPOLICY_ID sets the dequeuing policy for outgoing packets. It takes
|
||||
a policy ID as argument and can only be set before connecting (i.e. changes
|
||||
during an established connection are not supported). Currently, two policies are
|
||||
defined: the "simple" policy (DCCPQ_POLICY_SIMPLE), which does nothing special,
|
||||
and a priority-based variant (DCCPQ_POLICY_PRIO). The latter allows passing a
|
||||
u32 priority value as ancillary data to sendmsg(), where higher numbers indicate
|
||||
a higher packet priority (similar to SO_PRIORITY). This ancillary data needs to
|
||||
be formatted using a cmsg(3) message header filled in as follows:
|
||||
cmsg->cmsg_level = SOL_DCCP;
|
||||
cmsg->cmsg_type = DCCP_SCM_PRIORITY;
|
||||
cmsg->cmsg_len = CMSG_LEN(sizeof(uint32_t)); /* or CMSG_LEN(4) */
|
||||
|
||||
DCCP_SOCKOPT_QPOLICY_TXQLEN sets the maximum length of the output queue. A zero
|
||||
value is always interpreted as unbounded queue length. If different from zero,
|
||||
the interpretation of this parameter depends on the current dequeuing policy
|
||||
(see above): the "simple" policy will enforce a fixed queue size by returning
|
||||
EAGAIN, whereas the "prio" policy enforces a fixed queue length by dropping the
|
||||
lowest-priority packet first. The default value for this parameter is
|
||||
initialised from /proc/sys/net/dccp/default/tx_qlen.
|
||||
|
||||
DCCP_SOCKOPT_SERVICE sets the service. The specification mandates use of
|
||||
service codes (RFC 4340, sec. 8.1.2); if this socket option is not set,
|
||||
the socket will fall back to 0 (which means that no meaningful service code
|
||||
is present). On active sockets this is set before connect(); specifying more
|
||||
than one code has no effect (all subsequent service codes are ignored). The
|
||||
case is different for passive sockets, where multiple service codes (up to 32)
|
||||
can be set before calling bind().
|
||||
|
||||
DCCP_SOCKOPT_GET_CUR_MPS is read-only and retrieves the current maximum packet
|
||||
size (application payload size) in bytes, see RFC 4340, section 14.
|
||||
|
||||
DCCP_SOCKOPT_AVAILABLE_CCIDS is also read-only and returns the list of CCIDs
|
||||
supported by the endpoint. The option value is an array of type uint8_t whose
|
||||
size is passed as option length. The minimum array size is 4 elements; the
|
||||
value returned in the optlen argument always reflects the true number of
|
||||
built-in CCIDs.
|
||||
|
||||
DCCP_SOCKOPT_CCID is write-only and sets both the TX and RX CCIDs at the same
|
||||
time, combining the operation of the next two socket options. This option is
|
||||
preferable over the latter two, since often applications will use the same
|
||||
type of CCID for both directions; and mixed use of CCIDs is not currently well
|
||||
understood. This socket option takes as argument at least one uint8_t value, or
|
||||
an array of uint8_t values, which must match available CCIDS (see above). CCIDs
|
||||
must be registered on the socket before calling connect() or listen().
|
||||
|
||||
DCCP_SOCKOPT_TX_CCID is read/write. It returns the current CCID (if set) or sets
|
||||
the preference list for the TX CCID, using the same format as DCCP_SOCKOPT_CCID.
|
||||
Please note that the getsockopt argument type here is `int', not uint8_t.
|
||||
|
||||
DCCP_SOCKOPT_RX_CCID is analogous to DCCP_SOCKOPT_TX_CCID, but for the RX CCID.
|
||||
|
||||
DCCP_SOCKOPT_SERVER_TIMEWAIT enables the server (listening socket) to hold
|
||||
timewait state when closing the connection (RFC 4340, 8.3). The usual case is
|
||||
that the closing server sends a CloseReq, whereupon the client holds timewait
|
||||
state. When this boolean socket option is on, the server sends a Close instead
|
||||
and will enter TIMEWAIT. This option must be set after accept() returns.
|
||||
|
||||
DCCP_SOCKOPT_SEND_CSCOV and DCCP_SOCKOPT_RECV_CSCOV are used for setting the
|
||||
partial checksum coverage (RFC 4340, sec. 9.2). The default is that checksums
|
||||
always cover the entire packet and that only fully covered application data is
|
||||
accepted by the receiver. Hence, when using this feature on the sender, it must
|
||||
be enabled at the receiver too, with a suitable choice of CsCov.
|
||||
|
||||
DCCP_SOCKOPT_SEND_CSCOV sets the sender checksum coverage. Values in the
|
||||
range 0..15 are acceptable. The default setting is 0 (full coverage),
|
||||
values between 1..15 indicate partial coverage.
|
||||
DCCP_SOCKOPT_RECV_CSCOV is for the receiver and has a different meaning: it
|
||||
sets a threshold, where again values 0..15 are acceptable. The default
|
||||
of 0 means that all packets with a partial coverage will be discarded.
|
||||
Values in the range 1..15 indicate that packets with at least such a
|
||||
coverage value are also acceptable. The higher the number, the more
|
||||
restrictive this setting (see [RFC 4340, sec. 9.2.1]). Partial coverage
|
||||
settings are inherited by the child socket after accept().
|
||||
|
||||
The following two options apply to CCID 3 exclusively and are getsockopt()-only.
|
||||
In either case, a TFRC info struct (defined in <linux/tfrc.h>) is returned.
|
||||
DCCP_SOCKOPT_CCID_RX_INFO
|
||||
Returns a `struct tfrc_rx_info' in optval; the buffer for optval and
|
||||
optlen must be set to at least sizeof(struct tfrc_rx_info).
|
||||
DCCP_SOCKOPT_CCID_TX_INFO
|
||||
Returns a `struct tfrc_tx_info' in optval; the buffer for optval and
|
||||
optlen must be set to at least sizeof(struct tfrc_tx_info).
|
||||
|
||||
On unidirectional connections it is useful to close the unused half-connection
|
||||
via shutdown (SHUT_WR or SHUT_RD): this will reduce per-packet processing costs.
|
||||
|
||||
|
||||
Sysctl variables
|
||||
================
|
||||
Several DCCP default parameters can be managed by the following sysctls
|
||||
(sysctl net.dccp.default or /proc/sys/net/dccp/default):
|
||||
|
||||
request_retries
|
||||
The number of active connection initiation retries (the number of
|
||||
Requests minus one) before timing out. In addition, it also governs
|
||||
the behaviour of the other, passive side: this variable also sets
|
||||
the number of times DCCP repeats sending a Response when the initial
|
||||
handshake does not progress from RESPOND to OPEN (i.e. when no Ack
|
||||
is received after the initial Request). This value should be greater
|
||||
than 0; a value below 10 is suggested. Analogue of tcp_syn_retries.
|
||||
|
||||
retries1
|
||||
How often a DCCP Response is retransmitted until the listening DCCP
|
||||
side considers its connecting peer dead. Analogue of tcp_retries1.
|
||||
|
||||
retries2
|
||||
The number of times a general DCCP packet is retransmitted. This has
|
||||
importance for retransmitted acknowledgments and feature negotiation;
|
||||
data packets are never retransmitted. Analogue of tcp_retries2.
|
||||
|
||||
tx_ccid = 2
|
||||
Default CCID for the sender-receiver half-connection. Depending on the
|
||||
choice of CCID, the Send Ack Vector feature is enabled automatically.
|
||||
|
||||
rx_ccid = 2
|
||||
Default CCID for the receiver-sender half-connection; see tx_ccid.
|
||||
|
||||
seq_window = 100
|
||||
The initial sequence window (sec. 7.5.2) of the sender. This influences
|
||||
the local ackno validity and the remote seqno validity windows (7.5.1).
|
||||
Values in the range Wmin = 32 (RFC 4340, 7.5.2) up to 2^32-1 can be set.
|
||||
|
||||
tx_qlen = 5
|
||||
The size of the transmit buffer in packets. A value of 0 corresponds
|
||||
to an unbounded transmit buffer.
|
||||
|
||||
sync_ratelimit = 125 ms
|
||||
The timeout between subsequent DCCP-Sync packets sent in response to
|
||||
sequence-invalid packets on the same socket (RFC 4340, 7.5.4). The unit
|
||||
of this parameter is milliseconds; a value of 0 disables rate-limiting.
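To inspect the current defaults, the files under /proc/sys/net/dccp/default can be read directly. The helper below is a sketch with a hypothetical name; it accepts an alternative directory as its first argument and prints nothing if the directory does not exist (for example, when DCCP support is not loaded).

```shell
# show_dccp_defaults [DIR]
# Prints "name = value" for each sysctl file in DIR
# (default: /proc/sys/net/dccp/default).
show_dccp_defaults() {
    dir=${1:-/proc/sys/net/dccp/default}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s = %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

A single value can be changed in the usual way, for example: sysctl -w net.dccp.default.tx_qlen=10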
|
||||
|
||||
|
||||
IOCTLs
|
||||
======
|
||||
FIONREAD
|
||||
Works as in udp(7): returns in the `int' argument pointer the size of
|
||||
the next pending datagram in bytes, or 0 when no datagram is pending.
|
||||
|
||||
|
||||
Other tunables
|
||||
==============
|
||||
Per-route rto_min support
|
||||
CCID-2 supports the RTAX_RTO_MIN per-route setting for the minimum value
|
||||
of the RTO timer. This setting can be modified via the 'rto_min' option
|
||||
of iproute2; for example:
|
||||
> ip route change 10.0.0.0/24 rto_min 250j dev wlan0
|
||||
> ip route add 10.0.0.254/32 rto_min 800j dev wlan0
|
||||
> ip route show dev wlan0
|
||||
CCID-3 also supports the rto_min setting: it is used to define the lower
|
||||
bound for the expiry of the nofeedback timer. This can be useful on LANs
|
||||
with very low RTTs (e.g., loopback, Gbit ethernet).
|
||||
|
||||
|
||||
Notes
|
||||
=====
|
||||
DCCP does not travel through NAT successfully at present on many boxes. This is
|
||||
because the checksum covers the pseudo-header as per TCP and UDP. Linux NAT
|
||||
support for DCCP has been added.
|
||||
43
Documentation/networking/dctcp.txt
Normal file
|
|
@ -0,0 +1,43 @@
|
|||
DCTCP (DataCenter TCP)
|
||||
----------------------
|
||||
|
||||
DCTCP is an enhancement to the TCP congestion control algorithm for data
|
||||
center networks and leverages Explicit Congestion Notification (ECN) in
|
||||
the data center network to provide multi-bit feedback to the end hosts.
|
||||
|
||||
To enable it on end hosts:
|
||||
|
||||
sysctl -w net.ipv4.tcp_congestion_control=dctcp
|
||||
|
||||
All switches in the data center network running DCTCP must support ECN
|
||||
marking and be configured for marking when reaching defined switch buffer
|
||||
thresholds. The default ECN marking threshold heuristic for DCTCP on
|
||||
switches is 20 packets (30KB) at 1Gbps, and 65 packets (~100KB) at 10Gbps,
|
||||
but might need further careful tweaking.
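The packet and byte figures above are consistent with full-size 1500-byte packets. Under that assumption (it is not stated in the text), a threshold given in packets can be converted to bytes as sketched below; the function name is hypothetical.

```shell
# ecn_threshold_bytes PACKETS [PACKET_BYTES]
# Converts a marking threshold in packets to bytes.
# PACKET_BYTES defaults to 1500, an assumption made for this sketch.
ecn_threshold_bytes() {
    packets=$1
    packet_bytes=${2:-1500}
    echo $(( packets * packet_bytes ))
}

ecn_threshold_bytes 20    # prints 30000  (30KB at 1Gbps)
ecn_threshold_bytes 65    # prints 97500  (~100KB at 10Gbps)
```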
|
||||
|
||||
For more details, see the documents below:
|
||||
|
||||
Paper:
|
||||
|
||||
The algorithm is further described in detail in the following two
|
||||
SIGCOMM/SIGMETRICS papers:
|
||||
|
||||
i) Mohammad Alizadeh, Albert Greenberg, David A. Maltz, Jitendra Padhye,
|
||||
Parveen Patel, Balaji Prabhakar, Sudipta Sengupta, and Murari Sridharan:
|
||||
"Data Center TCP (DCTCP)", Data Center Networks session
|
||||
Proc. ACM SIGCOMM, New Delhi, 2010.
|
||||
http://simula.stanford.edu/~alizade/Site/DCTCP_files/dctcp-final.pdf
|
||||
http://www.sigcomm.org/ccr/papers/2010/October/1851275.1851192
|
||||
|
||||
ii) Mohammad Alizadeh, Adel Javanmard, and Balaji Prabhakar:
|
||||
"Analysis of DCTCP: Stability, Convergence, and Fairness"
|
||||
Proc. ACM SIGMETRICS, San Jose, 2011.
|
||||
http://simula.stanford.edu/~alizade/Site/DCTCP_files/dctcp_analysis-full.pdf
|
||||
|
||||
IETF informational draft:
|
||||
|
||||
http://tools.ietf.org/html/draft-bensley-tcpm-dctcp-00
|
||||
|
||||
DCTCP site:
|
||||
|
||||
http://simula.stanford.edu/~alizade/Site/DCTCP.html
|
||||
178
Documentation/networking/de4x5.txt
Normal file
|
|
@ -0,0 +1,178 @@
|
|||
Originally, this driver was written for the Digital Equipment
|
||||
Corporation series of EtherWORKS Ethernet cards:
|
||||
|
||||
DE425 TP/COAX EISA
|
||||
DE434 TP PCI
|
||||
DE435 TP/COAX/AUI PCI
|
||||
DE450 TP/COAX/AUI PCI
|
||||
DE500 10/100 PCI Fasternet
|
||||
|
||||
but it will now attempt to support all cards which conform to the
|
||||
Digital Semiconductor SROM Specification. The driver currently
|
||||
recognises the following chips:
|
||||
|
||||
DC21040 (no SROM)
|
||||
DC21041[A]
|
||||
DC21140[A]
|
||||
DC21142
|
||||
DC21143
|
||||
|
||||
So far the driver is known to work with the following cards:
|
||||
|
||||
KINGSTON
|
||||
Linksys
|
||||
ZNYX342
|
||||
SMC8432
|
||||
SMC9332 (w/new SROM)
|
||||
ZNYX31[45]
|
||||
ZNYX346 10/100 4 port (can act as a 10/100 bridge!)
|
||||
|
||||
The driver has been tested on a relatively busy network using the DE425,
|
||||
DE434, DE435 and DE500 cards and benchmarked with 'ttcp': it transferred
|
||||
16M of data to a DECstation 5000/200 as follows:
|
||||
|
||||
             TCP           UDP
|
||||
             TX     RX     TX     RX
|
||||
DE425     1030k   997k  1170k  1128k
|
||||
DE434     1063k   995k  1170k  1125k
|
||||
DE435     1063k   995k  1170k  1125k
|
||||
DE500     1063k   998k  1170k  1125k   in 10Mb/s mode
|
||||
|
||||
All values are typical (in kBytes/sec) from a sample of 4 for each
|
||||
measurement. Their error is +/-20k on a quiet (private) network and also
|
||||
depend on what load the CPU has.
|
||||
|
||||
=========================================================================
|
||||
|
||||
The ability to load this driver as a loadable module has been included
|
||||
and used extensively during the driver development (to save those long
|
||||
reboot sequences). Loadable module support under PCI and EISA has been
|
||||
achieved by letting the driver autoprobe as if it were compiled into the
|
||||
kernel. Do make sure you're not sharing interrupts with anything that
|
||||
cannot accommodate interrupt sharing!
|
||||
|
||||
To utilise this ability, you have to do 8 things:
|
||||
|
||||
0) have a copy of the loadable modules code installed on your system.
|
||||
1) copy de4x5.c from the /linux/drivers/net directory to your favourite
|
||||
temporary directory.
|
||||
2) for fixed autoprobes (not recommended), edit the source code near
|
||||
line 5594 to reflect the I/O address you're using, or assign these when
|
||||
loading by:
|
||||
|
||||
insmod de4x5 io=0xghh where g = bus number
|
||||
hh = device number
|
||||
|
||||
NB: autoprobing for modules is now supported by default. You may just
|
||||
use:
|
||||
|
||||
insmod de4x5
|
||||
|
||||
to load all available boards. For a specific board, still use
|
||||
the 'io=?' above.
|
||||
3) compile de4x5.c, but include -DMODULE in the command line to ensure
|
||||
that the correct bits are compiled (see end of source code).
|
||||
4) if you want to add a new card, go to step 5. Otherwise, recompile a
|
||||
kernel with the de4x5 configuration turned off and reboot.
|
||||
5) insmod de4x5 [io=0xghh]
|
||||
6) run the net startup bits for your new eth?? interface(s) manually
|
||||
(usually /etc/rc.inet[12] at boot time).
|
||||
7) enjoy!
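For step 2, the io= value packs the bus and device numbers into one hexadecimal number. The sketch below assumes, as the placeholder 0xghh suggests, one hex digit for the bus and two for the device; the function name is hypothetical and the encoding should be checked against the driver source before relying on it.

```shell
# de4x5_io_arg BUS DEVICE
# Builds the io= module argument in the 0xghh form described in step 2
# (g = bus number, hh = device number; both given here in decimal).
de4x5_io_arg() {
    printf 'io=0x%x%02x\n' "$1" "$2"
}

de4x5_io_arg 1 11    # prints io=0x10b
```

The result would be passed as: insmod de4x5 io=0x10b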
|
||||
|
||||
To unload a module, turn off the associated interface(s)
|
||||
'ifconfig eth?? down' then 'rmmod de4x5'.
|
||||
|
||||
Automedia detection is included so that in principle you can disconnect
|
||||
from, e.g. TP, reconnect to BNC and things will still work (after a
|
||||
pause whilst the driver figures out where its media went). My tests
|
||||
using ping showed that it appears to work....
|
||||
|
||||
By default, the driver will now autodetect any DECchip based card.
|
||||
Should you have a need to restrict the driver to DIGITAL only cards, you
|
||||
can compile with a DEC_ONLY define, or if loading as a module, use the
|
||||
'dec_only=1' parameter.
|
||||
|
||||
I've changed the timing routines to use the kernel timer and scheduling
|
||||
functions so that the hangs and other assorted problems that occurred
|
||||
while autosensing the media should be gone. A bonus for the DC21040
|
||||
auto media sense algorithm is that it can now use one that is more in
|
||||
line with the rest (the DC21040 chip doesn't have a hardware timer).
|
||||
The downside is the 1 'jiffies' (10ms) resolution.
|
||||
|
||||
IEEE 802.3u MII interface code has been added in anticipation that some
|
||||
products may use it in the future.
|
||||
|
||||
The SMC9332 card has a non-compliant SROM which needs fixing - I have
|
||||
patched this driver to detect it because the SROM format used complies
|
||||
to a previous DEC-STD format.
|
||||
|
||||
I have removed the buffer copies needed for receive on Intels. I cannot
|
||||
remove them for Alphas since the Tulip hardware only does longword
|
||||
aligned DMA transfers and the Alphas get alignment traps with non
|
||||
longword aligned data copies (which makes them really slow). No comment.
|
||||
|
||||
I have added SROM decoding routines to make this driver work with any
|
||||
card that supports the Digital Semiconductor SROM spec. This will help
|
||||
all cards running the dc2114x series chips in particular. Cards using
|
||||
the dc2104x chips should run correctly with the basic driver. I'm in
|
||||
debt to <mjacob@feral.com> for the testing and feedback that helped get
|
||||
this feature working. So far we have tested KINGSTON, SMC8432, SMC9332
|
||||
(with the latest SROM complying with the SROM spec V3: their first was
|
||||
broken), ZNYX342 and LinkSys. ZNYX314 (dual 21041 MAC) and ZNYX 315
|
||||
(quad 21041 MAC) cards also appear to work despite their incorrectly
|
||||
wired IRQs.
|
||||
|
||||
I have added a temporary fix for interrupt problems when some SCSI cards
|
||||
share the same interrupt as the DECchip based cards. The problem occurs
|
||||
because the SCSI card wants to grab the interrupt as a fast interrupt
|
||||
(runs the service routine with interrupts turned off) vs. this card
|
||||
which really needs to run the service routine with interrupts turned on.
|
||||
This driver will now add the interrupt service routine as a fast
|
||||
interrupt if it is bounced from the slow interrupt. THIS IS NOT A
|
||||
RECOMMENDED WAY TO RUN THE DRIVER and has been done for a limited time
|
||||
until people sort out their compatibility issues and the kernel
|
||||
interrupt service code is fixed. YOU SHOULD SEPARATE OUT THE FAST
|
||||
INTERRUPT CARDS FROM THE SLOW INTERRUPT CARDS to ensure that they do not
|
||||
run on the same interrupt. PCMCIA/CardBus is another can of worms...
|
||||
|
||||
Finally, I think I have really fixed the module loading problem with
|
||||
more than one DECchip based card. As a side effect, I don't mess with
|
||||
the device structure any more which means that if more than 1 card in
|
||||
2.0.x is installed (4 in 2.1.x), the user will have to edit
|
||||
linux/drivers/net/Space.c to make room for them. Hence, module loading
|
||||
is the preferred way to use this driver, since it doesn't have this
|
||||
limitation.
|
||||
|
||||
Where SROM media detection is used and full duplex is specified in the
|
||||
SROM, the feature is ignored unless lp->params.fdx is set at compile
|
||||
time OR during a module load (insmod de4x5 args='eth??:fdx' [see
|
||||
below]). This is because there is no way to automatically detect full
|
||||
duplex links except through autonegotiation. When I include the
|
||||
autonegotiation feature in the SROM autoconf code, this detection will
|
||||
occur automatically for that case.
|
||||
|
||||
Command line arguments are now allowed, similar to passing arguments
|
||||
through LILO. This will allow a per adapter board set up of full duplex
|
||||
and media. The only lexical constraints are: the board name (dev->name)
|
||||
appears in the list before its parameters. The list of parameters ends
|
||||
either at the end of the parameter list or with another board name. The
|
||||
following parameters are allowed:
|
||||
|
||||
fdx for full duplex
|
||||
autosense to set the media/speed; with the following
|
||||
sub-parameters:
|
||||
TP, TP_NW, BNC, AUI, BNC_AUI, 100Mb, 10Mb, AUTO
|
||||
|
||||
Case sensitivity is important for the sub-parameters. They *must* be
|
||||
upper case. Examples:
|
||||
|
||||
insmod de4x5 args='eth1:fdx autosense=BNC eth0:autosense=100Mb'
|
||||
|
||||
For a compiled in driver, in linux/drivers/net/CONFIG, place e.g.
|
||||
DE4X5_OPTS = -DDE4X5_PARM='"eth0:fdx autosense=AUI eth2:autosense=TP"'
|
||||
|
||||
Yes, I know full duplex isn't permissible on BNC or AUI; they're just
|
||||
examples. By default, full duplex is turned off and AUTO is the default
|
||||
autosense setting. In reality, I expect only the full duplex option to
|
||||
be used. Note the use of single quotes in the two examples above and the
|
||||
lack of commas to separate items.
|
||||
232
Documentation/networking/decnet.txt
Normal file
|
|
@ -0,0 +1,232 @@
|
|||
Linux DECnet Networking Layer Information
|
||||
===========================================
|
||||
|
||||
1) Other documentation....
|
||||
|
||||
o Project Home Pages
|
||||
http://www.chygwyn.com/ - Kernel info
|
||||
http://linux-decnet.sourceforge.net/ - Userland tools
|
||||
http://www.sourceforge.net/projects/linux-decnet/ - Status page
|
||||
|
||||
2) Configuring the kernel
|
||||
|
||||
Be sure to turn on the following options:
|
||||
|
||||
CONFIG_DECNET (obviously)
|
||||
CONFIG_PROC_FS (to see what's going on)
|
||||
CONFIG_SYSCTL (for easy configuration)
|
||||
|
||||
If you want to try out router support (not properly debugged yet)
|
||||
you'll need the following options as well...
|
||||
|
||||
CONFIG_DECNET_ROUTER (to be able to add/delete routes)
|
||||
CONFIG_NETFILTER (will be required for the DECnet routing daemon)
|
||||
|
||||
CONFIG_DECNET_ROUTE_FWMARK is optional
|
||||
|
||||
Don't turn on SIOCGIFCONF support for DECnet unless you are really sure
|
||||
that you need it; in general you won't, and it can cause ifconfig to
|
||||
malfunction.
|
||||
|
||||
Run time configuration has changed slightly from the 2.4 system. If you
|
||||
want to configure an endnode, then the simplified procedure is as follows:
|
||||
|
||||
o Set the MAC address on your ethernet card before starting _any_ other
|
||||
network protocols.
|
||||
|
||||
As soon as your network card is brought into the UP state, DECnet should
|
||||
start working. If you need something more complicated or are unsure how
|
||||
to set the MAC address, see the next section. Also all configurations which
|
||||
worked with 2.4 will work under 2.5 with no change.
|
||||
|
||||
3) Command line options
|
||||
|
||||
You can set a DECnet address on the kernel command line for compatibility
|
||||
with the 2.4 configuration procedure, but in general it's not needed any more.
|
||||
If you do set a DECnet address on the command line, it has only one purpose
which is that it's added to the addresses on the loopback device.
|
||||
|
||||
With 2.4 kernels, DECnet would only recognise addresses as local if they
|
||||
were added to the loopback device. In 2.5, any local interface address
|
||||
can be used to loop back to the local machine. Of course this does not
|
||||
prevent you adding further addresses to the loopback device if you
|
||||
want to.
|
||||
|
||||
N.B. Since the address list of an interface determines the addresses for
|
||||
which "hello" messages are sent, if you don't set an address on the loopback
|
||||
interface then you won't see any entries in /proc/net/decnet_neigh for the local
|
||||
host until such time as you start a connection. This doesn't affect the
|
||||
operation of the local communications in any other way though.
|
||||
|
||||
The kernel command line takes options looking like the following:
|
||||
|
||||
decnet.addr=1,2
|
||||
|
||||
The two numbers are the node address (1,2 = 1.2). For 2.2.xx kernels
|
||||
and early 2.3.xx kernels, you must use a comma when specifying the
|
||||
DECnet address like this. For more recent 2.3.xx kernels, you may
|
||||
use almost any character except space, although a `.` would be the most
|
||||
obvious choice :-)
|
||||
|
||||
There used to be a third number specifying the node type. This option
|
||||
has gone away in favour of a per interface node type. This is now set
|
||||
using /proc/sys/net/decnet/conf/<dev>/forwarding. This file can be
|
||||
set with a single digit, 0=EndNode, 1=L1 Router and 2=L2 Router.
|
||||
|
||||
There are also equivalent options for modules. The node address can
|
||||
also be set through the /proc/sys/net/decnet/ files, as can other system
|
||||
parameters.
|
||||
|
||||
Currently the only supported devices are ethernet and ip_gre. The
|
||||
ethernet address of your ethernet card has to be set according to the DECnet
|
||||
address of the node in order for it to be autoconfigured (and then appear in
|
||||
/proc/net/decnet_dev). There is a utility available at the above
|
||||
FTP sites called dn2ethaddr which can compute the correct ethernet
|
||||
address to use. The address can be set by ifconfig either before or
|
||||
at the time the device is brought up. If you are using RedHat you can
|
||||
add the line:
|
||||
|
||||
MACADDR=AA:00:04:00:03:04
|
||||
|
||||
or something similar, to /etc/sysconfig/network-scripts/ifcfg-eth0 or
|
||||
wherever your network card's configuration lives. Setting the MAC address
|
||||
of your ethernet card to an address starting with "hi-ord" will cause a
|
||||
DECnet address which matches to be added to the interface (which you can
|
||||
verify with iproute2).
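
As a rough illustration of what dn2ethaddr computes, the sketch below assumes
the usual DECnet Phase IV convention: the 16-bit address is area * 1024 + node,
and it is appended low byte first to the AA:00:04:00 prefix. The function name
decnet_mac is hypothetical; use dn2ethaddr itself for real configuration.

```shell
# Hypothetical helper: print the Ethernet MAC address for DECnet node
# AREA.NODE, assuming addr = area * 1024 + node stored low byte first
# after the AA:00:04:00 prefix.
decnet_mac() {
    area=$1
    node=$2
    # Phase IV limits: area 1..63, node 1..1023
    if [ "$area" -lt 1 ] || [ "$area" -gt 63 ] ||
       [ "$node" -lt 1 ] || [ "$node" -gt 1023 ]; then
        return 1
    fi
    addr=$(( area * 1024 + node ))
    printf 'AA:00:04:00:%02X:%02X\n' $(( addr % 256 )) $(( addr / 256 ))
}

decnet_mac 1 3
```

Under that assumption, node 1.3 maps to the MACADDR value used in the example
above.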
|
||||
|
||||
The default device for routing can be set through the /proc filesystem
|
||||
by setting /proc/sys/net/decnet/default_device to the
|
||||
device you want DECnet to route packets out of when no specific route
|
||||
is available. Usually this will be eth0, for example:
|
||||
|
||||
echo -n "eth0" >/proc/sys/net/decnet/default_device
|
||||
|
||||
If you don't set the default device, then it will default to the first
|
||||
ethernet card which has been autoconfigured as described above. You can
|
||||
confirm that by looking in the default_device file of course.
|
||||
|
||||
There is a list of what the other files under /proc/sys/net/decnet/ do
|
||||
on the kernel patch web site (shown above).
|
||||
|
||||
4) Run time kernel configuration
|
||||
|
||||
This is either done through the sysctl/proc interface (see the kernel web
|
||||
pages for details on what the various options do) or through the iproute2
|
||||
package in the same way as IPv4/6 configuration is performed.
|
||||
|
||||
Documentation for iproute2 is included with the package, although there is
|
||||
as yet no specific section on DECnet, most of the features apply to both
|
||||
IP and DECnet, albeit with DECnet addresses instead of IP addresses and
|
||||
a reduced functionality.
|
||||
|
||||
If you want to configure a DECnet router you'll need the iproute2 package
|
||||
since it's the _only_ way to add and delete routes currently. Eventually
|
||||
there will be a routing daemon to send and receive routing messages for
|
||||
each interface and update the kernel routing tables accordingly. The
|
||||
routing daemon will use netfilter to listen to routing packets, and
|
||||
rtnetlink to update the kernel's routing tables.
|
||||
|
||||
The DECnet raw socket layer has been removed since it was there purely
|
||||
for use by the routing daemon which will now use netfilter (a much cleaner
|
||||
and more generic solution) instead.
|
||||
|
||||
5) How can I tell if it's working ?
|
||||
|
||||
Here is a quick guide of what to look for in order to know if your DECnet
|
||||
kernel subsystem is working.
|
||||
|
||||
- Is the node address set (see /proc/sys/net/decnet/node_address)
|
||||
- Is the node of the correct type
|
||||
(see /proc/sys/net/decnet/conf/<dev>/forwarding)
|
||||
- Is the Ethernet MAC address of each Ethernet card set to match
|
||||
the DECnet address. If in doubt use the dn2ethaddr utility available
|
||||
at the ftp archive.
|
||||
- If the previous two steps are satisfied, and the Ethernet card is up,
|
||||
you should find that it is listed in /proc/net/decnet_dev and also
|
||||
that it appears as a directory in /proc/sys/net/decnet/conf/. The
|
||||
loopback device (lo) should also appear and is required to communicate
|
||||
within a node.
|
||||
- If you have any DECnet routers on your network, they should appear
|
||||
in /proc/net/decnet_neigh, otherwise this file will only contain the
|
||||
entry for the node itself (if it doesn't, check to see if lo is up).
|
||||
- If you want to send to any node which is not listed in the
|
||||
/proc/net/decnet_neigh file, you'll need to set the default device
|
||||
to point to an Ethernet card with connection to a router. This is
|
||||
again done with the /proc/sys/net/decnet/default_device file.
|
||||
- Try starting a simple server and client, like the dnping/dnmirror
|
||||
over the loopback interface. With luck they should communicate.
|
||||
For this step and those after, you'll need the DECnet library
|
||||
which can be obtained from the above ftp sites as well as the
|
||||
actual utilities themselves.
|
||||
- If this seems to work, then try talking to a node on your local
|
||||
network, and see if you can obtain the same results.
|
||||
- At this point you are on your own... :-)
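
The first few checks in the list above can be gathered into one small script.
This is only a sketch: the function name decnet_check and its two arguments
(a root directory, defaulting to /proc, and a device name, defaulting to eth0)
are assumptions made for illustration; the file names are the ones mentioned
in this document.

```shell
# Hypothetical helper: print the contents of the DECnet status files named
# in the checklist, or "missing" when a file cannot be read.
decnet_check() {
    root=${1:-/proc}
    dev=${2:-eth0}
    for f in "$root/sys/net/decnet/node_address" \
             "$root/sys/net/decnet/conf/$dev/forwarding" \
             "$root/sys/net/decnet/default_device" \
             "$root/net/decnet_dev" \
             "$root/net/decnet_neigh"; do
        if [ -r "$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$f")"
        else
            printf '%s: missing\n' "$f"
        fi
    done
}
```

Running "decnet_check /proc eth0" on a configured node should show the node
address, the forwarding setting and the default device; any line reported as
missing points at the step to revisit.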
|
||||
|
||||
6) How to send a bug report
|
||||
|
||||
If you've found a bug and want to report it, then there are several things
|
||||
you can do to help me work out exactly what it is that is wrong. Useful
|
||||
information (_most_ of which _is_ _essential_) includes:
|
||||
|
||||
- What kernel version are you running ?
|
||||
- What version of the patch are you running ?
|
||||
- How far through the above set of tests can you get ?
|
||||
- What is in the /proc/net/decnet* files and /proc/sys/net/decnet/* files ?
|
||||
- Which services are you running ?
|
||||
- Which client caused the problem ?
|
||||
- How much data was being transferred ?
|
||||
- Was the network congested ?
|
||||
- How can the problem be reproduced ?
|
||||
- Can you use tcpdump to get a trace ? (N.B. Most (all?) versions of
|
||||
tcpdump don't understand how to dump DECnet properly, so including
|
||||
the hex listing of the packet contents is _essential_, usually the -x flag.
|
||||
You may also need to increase the length grabbed with the -s flag. The
|
||||
-e flag also provides very useful information (ethernet MAC addresses))
|
||||
|
||||
7) MAC FAQ
|
||||
|
||||
A quick FAQ on ethernet MAC addresses to explain how Linux and DECnet
|
||||
interact and how to get the best performance from your hardware.
|
||||
|
||||
Ethernet cards are designed to normally only pass received network frames
|
||||
to a host computer when they are addressed to it, or to the broadcast address.
|
||||
|
||||
Linux has an interface which allows the setting of extra addresses for
|
||||
an ethernet card to listen to. If the ethernet card supports it, the
|
||||
filtering operation will be done in hardware, if not the extra unwanted packets
|
||||
received will be discarded by the host computer. In the latter case,
|
||||
significant processor time and bus bandwidth can be used up on a busy
|
||||
network (see the NAPI documentation for a longer explanation of these
|
||||
effects).
|
||||
|
||||
DECnet makes use of this interface to allow running DECnet on an ethernet
|
||||
card which has already been configured using TCP/IP (presumably using the
|
||||
built in MAC address of the card, as usual) and/or to allow multiple DECnet
|
||||
addresses on each physical interface. If you do this, be aware that if your
|
||||
ethernet card doesn't support perfect hashing in its MAC address filter
|
||||
then your computer will be doing more work than required. Some cards
|
||||
will simply set themselves into promiscuous mode in order to receive
|
||||
packets from the DECnet specified addresses. So if you have one of these
|
||||
cards it's better to set the MAC address of the card as described above
|
||||
to gain the best efficiency. Better still is to use a card which supports
|
||||
NAPI as well.
|
||||
|
||||
|
||||
8) Mailing list
|
||||
|
||||
If you are keen to get involved in development, or want to ask questions
|
||||
about configuration, or even just report bugs, then there is a mailing
|
||||
list that you can join, details are at:
|
||||
|
||||
http://sourceforge.net/mail/?group_id=4993
|
||||
|
||||
9) Legal Info
|
||||
|
||||
The Linux DECnet project team have placed their code under the GPL. The
|
||||
software is provided "as is" and without warranty express or implied.
|
||||
DECnet is a trademark of Compaq. This software is not a product of
|
||||
Compaq. We acknowledge the help of people at Compaq in providing extra
|
||||
documentation above and beyond what was previously publicly available.
|
||||
|
||||
Steve Whitehouse <SteveW@ACM.org>
|
||||
|
||||
282
Documentation/networking/dl2k.txt
Normal file
|
|
@ -0,0 +1,282 @@
|
|||
|
||||
D-Link DL2000-based Gigabit Ethernet Adapter Installation
|
||||
for Linux
|
||||
May 23, 2002
|
||||
|
||||
Contents
|
||||
========
|
||||
- Compatibility List
|
||||
- Quick Install
|
||||
- Compiling the Driver
|
||||
- Installing the Driver
|
||||
- Option parameter
|
||||
- Configuration Script Sample
|
||||
- Troubleshooting
|
||||
|
||||
|
||||
Compatibility List
|
||||
==================
|
||||
Adapter Support:
|
||||
|
||||
D-Link DGE-550T Gigabit Ethernet Adapter.
|
||||
D-Link DGE-550SX Gigabit Ethernet Adapter.
|
||||
D-Link DL2000-based Gigabit Ethernet Adapter.
|
||||
|
||||
|
||||
The driver supports Linux kernel 2.4.7 and later. We have tested it
on the environments below.
|
||||
|
||||
. Red Hat v6.2 (update kernel to 2.4.7)
|
||||
. Red Hat v7.0 (update kernel to 2.4.7)
|
||||
. Red Hat v7.1 (kernel 2.4.7)
|
||||
. Red Hat v7.2 (kernel 2.4.7-10)
|
||||
|
||||
|
||||
Quick Install
|
||||
=============
|
||||
Install the Linux driver with the following commands:
|
||||
|
||||
1. make all
|
||||
2. insmod dl2k.ko
|
||||
3. ifconfig eth0 up 10.xxx.xxx.xxx netmask 255.0.0.0
                    ^^^^^^^^^^^^^^         ^^^^^^^^^
                          IP                NETMASK
|
||||
Now eth0 should be active; you can test it with "ping" or get more information
with "ifconfig". If it tests OK, continue to the next step.
|
||||
|
||||
4. cp dl2k.ko /lib/modules/`uname -r`/kernel/drivers/net
|
||||
5. Add the following line to /etc/modprobe.d/dl2k.conf:
|
||||
alias eth0 dl2k
|
||||
6. Run depmod to update the module indexes.
|
||||
7. Run "netconfig" or "netconf" to create configuration script ifcfg-eth0
|
||||
located at /etc/sysconfig/network-scripts or create it manually.
|
||||
[see - Configuration Script Sample]
|
||||
8. Driver will automatically load and configure at next boot time.
|
||||
|
||||
Compiling the Driver
|
||||
====================
|
||||
In Linux, NIC drivers are most commonly configured as loadable modules.
|
||||
The approach of building a monolithic kernel has become obsolete. The driver
|
||||
can be compiled as part of a monolithic kernel, but this is strongly discouraged.
|
||||
The remainder of this section assumes the driver is built as a loadable module.
|
||||
In the Linux environment, it is a good idea to rebuild the driver from the
|
||||
source instead of relying on a precompiled version. This approach provides
|
||||
better reliability since a precompiled driver might depend on libraries or
|
||||
kernel features that are not present in a given Linux installation.
|
||||
|
||||
The three files necessary to build the Linux device driver are dl2k.c, dl2k.h
and Makefile. To compile, the Linux installation must include the gcc compiler,
the kernel source, and the kernel headers. The driver supports Linux kernel
2.4.7 and later. Copy the files to a directory and enter the following commands
to compile and link the driver:
|
||||
|
||||
CD-ROM drive
|
||||
------------
|
||||
|
||||
[root@XXX /] mkdir cdrom
|
||||
[root@XXX /] mount -r -t iso9660 -o conv=auto /dev/cdrom /cdrom
|
||||
[root@XXX /] cd root
|
||||
[root@XXX /root] mkdir dl2k
|
||||
[root@XXX /root] cd dl2k
|
||||
[root@XXX dl2k] cp /cdrom/linux/dl2k.tgz /root/dl2k
|
||||
[root@XXX dl2k] tar xfvz dl2k.tgz
|
||||
[root@XXX dl2k] make all
|
||||
|
||||
Floppy disc drive
|
||||
-----------------
|
||||
|
||||
[root@XXX /] cd root
|
||||
[root@XXX /root] mkdir dl2k
|
||||
[root@XXX /root] cd dl2k
|
||||
[root@XXX dl2k] mcopy a:/linux/dl2k.tgz /root/dl2k
|
||||
[root@XXX dl2k] tar xfvz dl2k.tgz
|
||||
[root@XXX dl2k] make all
|
||||
|
||||
Installing the Driver
|
||||
=====================
|
||||
|
||||
Manual Installation
|
||||
-------------------
|
||||
Once the driver has been compiled, it must be loaded, enabled, and bound
|
||||
to a protocol stack in order to establish network connectivity. To load a
|
||||
module enter the command:
|
||||
|
||||
insmod dl2k.o
|
||||
|
||||
or
|
||||
|
||||
insmod dl2k.o <optional parameter> ; add parameter
|
||||
|
||||
===============================================================
|
||||
example: insmod dl2k.o media=100mbps_hd
|
||||
or insmod dl2k.o media=3
|
||||
or insmod dl2k.o media=3,2 ; for 2 cards
|
||||
===============================================================
|
||||
|
||||
Please refer to the list of the command line parameters supported by
|
||||
the Linux device driver below.
|
||||
|
||||
The insmod command only loads the driver and gives it a name of the form
|
||||
eth0, eth1, etc. To bring the NIC into an operational state,
|
||||
it is necessary to issue the following command:
|
||||
|
||||
ifconfig eth0 up
|
||||
|
||||
Finally, to bind the driver to the active protocol (e.g., TCP/IP with
|
||||
Linux), enter the following command:
|
||||
|
||||
ifup eth0
|
||||
|
||||
Note that this is meaningful only if the system can find a configuration
|
||||
script that contains the necessary network information. A sample will be
|
||||
given in the next paragraph.
|
||||
|
||||
The commands to unload a driver are as follows:
|
||||
|
||||
ifdown eth0
|
||||
ifconfig eth0 down
|
||||
rmmod dl2k.o
|
||||
|
||||
The following are the commands to list the currently loaded modules and
|
||||
to see the current network configuration.
|
||||
|
||||
lsmod
|
||||
ifconfig
|
||||
|
||||
|
||||
Automated Installation
|
||||
----------------------
|
||||
This section describes how to install the driver such that it is
|
||||
automatically loaded and configured at boot time. The following description
|
||||
is based on a Red Hat 6.0/7.0 distribution, but it can easily be ported to
|
||||
other distributions as well.
|
||||
|
||||
Red Hat v6.x/v7.x
|
||||
-----------------
|
||||
1. Copy dl2k.o to the network modules directory, typically
|
||||
/lib/modules/2.x.x-xx/net or /lib/modules/2.x.x/kernel/drivers/net.
|
||||
2. Locate the boot module configuration file, most commonly in the
|
||||
/etc/modprobe.d/ directory. Add the following lines:
|
||||
|
||||
alias ethx dl2k
|
||||
options dl2k <optional parameters>
|
||||
|
||||
where ethx will be eth0 if the NIC is the only ethernet adapter, eth1 if
|
||||
one other ethernet adapter is installed, etc. Refer to the Parameter
Description section for the list of optional parameters.
|
||||
3. Locate the network configuration scripts, normally the
|
||||
/etc/sysconfig/network-scripts directory, and create a configuration
|
||||
script named ifcfg-ethx that contains network information.
|
||||
4. Note that for most Linux distributions, Red Hat included, a configuration
|
||||
utility with a graphical user interface is provided to perform steps 2
|
||||
and 3 above.
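
The two lines from step 2 can be generated rather than typed by hand. The
sketch below is illustrative only: the function name dl2k_modprobe_conf is
hypothetical, and the options shown in the usage line are example values
taken from the parameter list in this document.

```shell
# Hypothetical helper: print the modprobe configuration lines for one
# interface. The first argument is the interface name (ethx); any further
# arguments are passed through as dl2k module options.
dl2k_modprobe_conf() {
    iface=$1
    shift
    printf 'alias %s dl2k\n' "$iface"
    if [ "$#" -gt 0 ]; then
        printf 'options dl2k %s\n' "$*"
    fi
}

# Example: review the output, then redirect it to the file from step 2.
dl2k_modprobe_conf eth0 media=100mbps_fd jumbo=1
```

The function only prints text, so nothing is changed on the system until the
output is written to the configuration file by hand.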
|
||||
|
||||
|
||||
Parameter Description
|
||||
=====================
|
||||
You can install this driver without any additional parameters. However, if you
need its extended functions, you must set extra parameters. Below is a list of
the command line parameters supported by the Linux device driver.
|
||||
|
||||
mtu=packet_size                 - Specifies the maximum packet size. Default
                                  is 1500.

media=media_type                - Specifies the media type the NIC operates at.
                                  autosense     Autosensing active media.
                                  10mbps_hd     10Mbps half duplex.
                                  10mbps_fd     10Mbps full duplex.
                                  100mbps_hd    100Mbps half duplex.
                                  100mbps_fd    100Mbps full duplex.
                                  1000mbps_fd   1000Mbps full duplex.
                                  1000mbps_hd   1000Mbps half duplex.
                                  0             Autosensing active media.
                                  1             10Mbps half duplex.
                                  2             10Mbps full duplex.
                                  3             100Mbps half duplex.
                                  4             100Mbps full duplex.
                                  5             1000Mbps half duplex.
                                  6             1000Mbps full duplex.

                                  By default, the NIC operates at autosense.
                                  1000mbps_fd and 1000mbps_hd types are only
                                  available for fiber adapters.

vlan=n                          - Specifies the VLAN ID. If vlan=0, the
                                  Virtual Local Area Network (VLAN) function is
                                  disabled.

jumbo=[0|1]                     - Specifies the jumbo frame support. If jumbo=1,
                                  the NIC accepts jumbo frames. By default, this
                                  function is disabled.
                                  Jumbo frames usually improve performance
                                  on gigabit links.
                                  This feature needs a jumbo frame compatible
                                  remote.

rx_coalesce=m                   - Number of rx frames handled each interrupt.
rx_timeout=n                    - Rx DMA wait time for an interrupt.
                                  If rx_coalesce > 0, the hardware asserts only
                                  one interrupt per m frames. It won't assert
                                  an rx interrupt until m frames are received
                                  or a timeout of n * 640 nanoseconds is reached.
                                  Setting rx_coalesce and rx_timeout properly
                                  can reduce congestion collapse and overload,
                                  which have been a bottleneck for high speed
                                  networks.

                                  For example, with rx_coalesce=10
                                  rx_timeout=800 the hardware asserts only
                                  1 interrupt per 10 frames received or after
                                  a timeout of 512 us.

tx_coalesce=n                   - Number of tx frames handled each interrupt.
                                  Setting n > 1 can reduce the interrupt
                                  congestion that usually lowers the
                                  performance of a high speed network card.
                                  Default is 16.

tx_flow=[1|0]                   - Specifies the Tx flow control. If tx_flow=0,
                                  Tx flow control is disabled; otherwise the
                                  driver autodetects.
rx_flow=[1|0]                   - Specifies the Rx flow control. If rx_flow=0,
                                  Rx flow control is disabled; otherwise the
                                  driver autodetects.
|
||||
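
The rx_timeout arithmetic can be checked with a few lines of shell. This is a
sketch only: the function names are hypothetical, and the 640 nanosecond unit
is the one given in the rx_timeout description above.

```shell
# Hypothetical helpers: convert an rx_timeout value n into the wait time
# it represents, using the n * 640 nanosecond rule from the text.
dl2k_rx_timeout_ns() {
    echo $(( $1 * 640 ))
}

dl2k_rx_timeout_us() {
    echo $(( $1 * 640 / 1000 ))
}

dl2k_rx_timeout_us 800
```

For rx_timeout=800 this gives 512000 ns, that is 512 us, which agrees with the
example in the parameter description.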
|
||||
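
The media parameter accepts either a name or a number. The lookup below
restates the media table as a shell function so the two forms can be compared;
the function name dl2k_media_number is hypothetical and the mapping is copied
from the table above.

```shell
# Hypothetical helper: translate a media name into the equivalent numeric
# value accepted by the media= parameter. Fails for unknown names.
dl2k_media_number() {
    case "$1" in
        autosense)   echo 0 ;;
        10mbps_hd)   echo 1 ;;
        10mbps_fd)   echo 2 ;;
        100mbps_hd)  echo 3 ;;
        100mbps_fd)  echo 4 ;;
        1000mbps_hd) echo 5 ;;
        1000mbps_fd) echo 6 ;;
        *) return 1 ;;
    esac
}

dl2k_media_number 100mbps_hd
```

This matches the manual installation example, where media=100mbps_hd and
media=3 are given as equivalent.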
|
||||
Configuration Script Sample
|
||||
===========================
|
||||
Here is a sample of a simple configuration script:
|
||||
|
||||
DEVICE=eth0
USERCTL=no
ONBOOT=yes
BOOTPROTO=none
BROADCAST=207.200.5.255
NETWORK=207.200.5.0
NETMASK=255.255.255.0
IPADDR=207.200.5.2
|
||||
|
||||
|
||||
Troubleshooting
|
||||
===============
|
||||
Q1: Source files contain ^M at the end of every line.
Make sure all files are in Unix file format (no CR characters). Try one of the
following shell commands to convert files.
|
||||
|
||||
cat dl2k.c | col -b > dl2k.tmp
|
||||
mv dl2k.tmp dl2k.c
|
||||
|
||||
OR
|
||||
|
||||
cat dl2k.c | tr -d "\r" > dl2k.tmp
|
||||
mv dl2k.tmp dl2k.c
|
||||
|
||||
Q2: Could not find header files (*.h) ?
To compile the driver, you need kernel header files. After
installing the kernel source, the header files are usually located in
/usr/src/linux/include, which is the default include directory configured
in Makefile. For some distributions, there is a copy of header files in
/usr/include/linux and /usr/include/asm, so you can change the
INCLUDEDIR in Makefile to /usr/include without installing kernel source.
Note that RH 7.0 didn't provide correct header files in /usr/include;
including those files will build the driver against the wrong version.
|
||||
|
||||
167
Documentation/networking/dm9000.txt
Normal file
|
|
@ -0,0 +1,167 @@
|
|||
DM9000 Network driver
|
||||
=====================
|
||||
|
||||
Copyright 2008 Simtec Electronics,
|
||||
Ben Dooks <ben@simtec.co.uk> <ben-linux@fluff.org>
|
||||
|
||||
|
||||
Introduction
|
||||
------------
|
||||
|
||||
This file describes how to use the DM9000 platform-device based network driver
|
||||
that is contained in the files drivers/net/dm9000.c and drivers/net/dm9000.h.
|
||||
|
||||
The driver supports three DM9000 variants, the DM9000E which is the first chip
|
||||
supported as well as the newer DM9000A and DM9000B devices. It is currently
|
||||
maintained and tested by Ben Dooks, who should be CC'd on any patches for this
|
||||
driver.
|
||||
|
||||
|
||||
Defining the platform device
|
||||
----------------------------
|
||||
|
||||
The minimum set of resources attached to the platform device are as follows:
|
||||
|
||||
1) The physical address of the address register
|
||||
2) The physical address of the data register
|
||||
3) The IRQ line the device's interrupt pin is connected to.
|
||||
|
||||
These resources should be specified in that order, as the ordering of the
|
||||
two address regions is important (the driver expects these to be address
|
||||
and then data).
|
||||
|
||||
An example from arch/arm/mach-s3c2410/mach-bast.c is:
|
||||
|
||||
static struct resource bast_dm9k_resource[] = {
        [0] = {
                .start = S3C2410_CS5 + BAST_PA_DM9000,
                .end   = S3C2410_CS5 + BAST_PA_DM9000 + 3,
                .flags = IORESOURCE_MEM,
        },
        [1] = {
                .start = S3C2410_CS5 + BAST_PA_DM9000 + 0x40,
                .end   = S3C2410_CS5 + BAST_PA_DM9000 + 0x40 + 0x3f,
                .flags = IORESOURCE_MEM,
        },
        [2] = {
                .start = IRQ_DM9000,
                .end   = IRQ_DM9000,
                .flags = IORESOURCE_IRQ | IORESOURCE_IRQ_HIGHLEVEL,
        }
};

static struct platform_device bast_device_dm9k = {
        .name           = "dm9000",
        .id             = 0,
        .num_resources  = ARRAY_SIZE(bast_dm9k_resource),
        .resource       = bast_dm9k_resource,
};
|
||||
|
||||
Note the setting of the IRQ trigger flag in bast_dm9k_resource[2].flags,
|
||||
as this will generate a warning if it is not present. The trigger from
|
||||
the flags field will be passed to request_irq() when registering the IRQ
|
||||
handler to ensure that the IRQ is set up correctly.
|
||||
|
||||
This shows a typical platform device, without the optional configuration
|
||||
platform data supplied. The next example uses the same resources, but adds
|
||||
the optional platform data to pass extra configuration data:
|
||||
|
||||
static struct dm9000_plat_data bast_dm9k_platdata = {
        .flags          = DM9000_PLATF_16BITONLY,
};

static struct platform_device bast_device_dm9k = {
        .name           = "dm9000",
        .id             = 0,
        .num_resources  = ARRAY_SIZE(bast_dm9k_resource),
        .resource       = bast_dm9k_resource,
        .dev            = {
                .platform_data = &bast_dm9k_platdata,
        }
};
|
||||
|
||||
The platform data is defined in include/linux/dm9000.h and described below.
|
||||
|
||||
|
||||
Platform data
|
||||
-------------
|
||||
|
||||
Extra platform data for the DM9000 can describe the IO bus width to the
|
||||
device, whether or not an external PHY is attached to the device and
|
||||
the availability of an external configuration EEPROM.
|
||||
|
||||
The flags for the platform data .flags field are as follows:
|
||||
|
||||
DM9000_PLATF_8BITONLY
|
||||
|
||||
The IO should be done with 8bit operations.
|
||||
|
||||
DM9000_PLATF_16BITONLY
|
||||
|
||||
The IO should be done with 16bit operations.
|
||||
|
||||
DM9000_PLATF_32BITONLY
|
||||
|
||||
The IO should be done with 32bit operations.
|
||||
|
||||
DM9000_PLATF_EXT_PHY
|
||||
|
||||
The chip is connected to an external PHY.
|
||||
|
||||
DM9000_PLATF_NO_EEPROM
|
||||
|
||||
This can be used to signify that the board does not have an
|
||||
EEPROM, or that the EEPROM should be hidden from the user.
|
||||
|
||||
DM9000_PLATF_SIMPLE_PHY
|
||||
|
||||
Switch to using the simpler PHY polling method which does not
|
||||
try and read the MII PHY state regularly. This is only available
|
||||
when using the internal PHY. See the section on link state polling
|
||||
for more information.
|
||||
|
||||
The config symbol DM9000_FORCE_SIMPLE_PHY_POLL, Kconfig entry
|
||||
"Force simple NSR based PHY polling" allows this flag to be
|
||||
forced on at build time.
|
||||
|
||||
|
||||
PHY Link state polling
|
||||
----------------------
|
||||
|
||||
The driver keeps track of the link state and informs the network core
|
||||
about link (carrier) availability. This is managed by several methods
|
||||
depending on the version of the chip and on which PHY is being used.
|
||||
|
||||
For the internal PHY, the original (and currently default) method is
|
||||
to read the MII state, either when the status changes if we have the
|
||||
necessary interrupt support in the chip or every two seconds via a
|
||||
periodic timer.
|
||||
|
||||
To reduce the overhead for the internal PHY, there is now the option
|
||||
of using the DM9000_FORCE_SIMPLE_PHY_POLL config, or DM9000_PLATF_SIMPLE_PHY
|
||||
platform data option to read the summary information without the
|
||||
expensive MII accesses. This method is faster, but does not print
|
||||
as much information.
|
||||
|
||||
When using an external PHY, the driver currently has to poll the MII
|
||||
link status as there is no method for getting an interrupt on link change.
|
||||
|
||||
|
||||
DM9000A / DM9000B
|
||||
-----------------
|
||||
|
||||
These chips are functionally similar to the DM9000E and are supported easily
|
||||
by the same driver. The features are:
|
||||
|
||||
1) Interrupt on internal PHY state change. This means that the periodic
|
||||
polling of the PHY status may be disabled on these devices when using
|
||||
the internal PHY.
|
||||
|
||||
2) TCP/UDP checksum offloading, which the driver does not currently support.
|
||||
|
||||
|
||||
ethtool
|
||||
-------
|
||||
|
||||
The driver supports the ethtool interface for access to the driver
|
||||
state information, the PHY state and the EEPROM.
|
||||
66
Documentation/networking/dmfe.txt
Normal file
|
|
@ -0,0 +1,66 @@
|
|||
Note: This driver doesn't have a maintainer.
|
||||
|
||||
Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver for Linux.
|
||||
|
||||
This program is free software; you can redistribute it and/or
|
||||
modify it under the terms of the GNU General Public License
|
||||
as published by the Free Software Foundation; either version 2
|
||||
of the License, or (at your option) any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
|
||||
This driver provides kernel support for Davicom DM9102(A)/DM9132/DM9801 ethernet cards (CNET
10/100 ethernet cards use the Davicom chipset too, so this driver supports CNET cards as well). If you
didn't compile this driver as a module, it will automatically load itself on boot and print a
line similar to:
|
||||
|
||||
dmfe: Davicom DM9xxx net driver, version 1.36.4 (2002-01-17)
|
||||
|
||||
If you compiled this driver as a module, you have to load it on boot. You can load it with the command:
|
||||
|
||||
insmod dmfe
|
||||
|
||||
This way it will autodetect the device mode. This is the suggested way to load the module. Or you can pass
a mode= setting to the module while loading, like:
|
||||
|
||||
insmod dmfe mode=0 # Force 10M Half Duplex
|
||||
insmod dmfe mode=1 # Force 100M Half Duplex
|
||||
insmod dmfe mode=4 # Force 10M Full Duplex
|
||||
insmod dmfe mode=5 # Force 100M Full Duplex
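
The mode numbers are not contiguous, so a small lookup can make scripts easier
to read. This is a sketch only: the function name dmfe_mode_name is
hypothetical, and the mapping is copied from the four insmod lines above.

```shell
# Hypothetical helper: describe the forced setting selected by a dmfe
# mode= value. Fails for values not listed in this document.
dmfe_mode_name() {
    case "$1" in
        0) echo "10M Half Duplex" ;;
        1) echo "100M Half Duplex" ;;
        4) echo "10M Full Duplex" ;;
        5) echo "100M Full Duplex" ;;
        *) return 1 ;;
    esac
}

dmfe_mode_name 5
```

Values other than the four listed here are treated as unknown by this helper;
whether the driver accepts further values is not stated in this document.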
|
||||
|
||||
Next you should configure your network interface with a command similar to :
|
||||
|
||||
    ifconfig eth0 172.22.3.18
                  ^^^^^^^^^^^
                Your IP Address
|
||||
|
||||
Then you may have to modify the default routing table with command :
|
||||
|
||||
route add default eth0
|
||||
|
||||
|
||||
Now your ethernet card should be up and running.
|
||||
|
||||
|
||||
TODO:
|
||||
|
||||
Implement pci_driver::suspend() and pci_driver::resume() power management methods.
|
||||
Check on 64 bit boxes.
|
||||
Check and fix on big endian boxes.
|
||||
Test and make sure PCI latency is now correct for all cases.
|
||||
|
||||
|
||||
Authors:
|
||||
|
||||
Sten Wang <sten_wang@davicom.com.tw> : Original Author
|
||||
|
||||
Contributors:
|
||||
|
||||
Marcelo Tosatti <marcelo@conectiva.com.br>
|
||||
Alan Cox <alan@lxorguk.ukuu.org.uk>
|
||||
Jeff Garzik <jgarzik@pobox.com>
|
||||
Vojtech Pavlik <vojtech@suse.cz>
|
||||
157
Documentation/networking/dns_resolver.txt
Normal file
|
|
@ -0,0 +1,157 @@
|
|||
===================
|
||||
DNS Resolver Module
|
||||
===================
|
||||
|
||||
Contents:
|
||||
|
||||
- Overview.
|
||||
- Compilation.
|
||||
- Setting up.
|
||||
- Usage.
|
||||
- Mechanism.
|
||||
- Debugging.
|
||||
|
||||
|
||||
========
|
||||
OVERVIEW
|
||||
========
|
||||
|
||||
The DNS resolver module provides a way for kernel services to make DNS queries
|
||||
by way of requesting a key of key type dns_resolver. These queries are
|
||||
upcalled to userspace through /sbin/request-key.
|
||||
|
||||
These routines must be supported by userspace tools dns.upcall, cifs.upcall and
|
||||
request-key. It is under development and does not yet provide the full feature
|
||||
set. The features it does support include:
|
||||
|
||||
(*) Implements the dns_resolver key_type to contact userspace.
|
||||
|
||||
It does not yet support the following AFS features:
|
||||
|
||||
(*) DNS query support for AFSDB resource record.
|
||||
|
||||
This code is extracted from the CIFS filesystem.
|
||||
|
||||
|
||||
===========
|
||||
COMPILATION
|
||||
===========
|
||||
|
||||
The module should be enabled by turning on the kernel configuration options:
|
||||
|
||||
CONFIG_DNS_RESOLVER - tristate "DNS Resolver support"
|
||||
|
||||
|
||||
==========
|
||||
SETTING UP
|
||||
==========
|
||||
|
||||
To set up this facility, the /etc/request-key.conf file must be altered so that
|
||||
/sbin/request-key can appropriately direct the upcalls. For example, to handle
|
||||
basic dname to IPv4/IPv6 address resolution, the following line should be
|
||||
added:
|
||||
|
||||
#OP     TYPE         DESC    CO-INFO PROGRAM ARG1 ARG2 ARG3 ...
#====== ============ ======= ======= ==========================
create  dns_resolver *       *       /usr/sbin/cifs.upcall %k
|
||||
|
||||
To direct a query for query type 'foo', a line of the following form should be added
before the more general line given above, as the first match is the one taken.
|
||||
|
||||
create  dns_resolver foo:*   *       /usr/sbin/dns.foo %k
|
||||
|
||||
|
||||
=====
|
||||
USAGE
|
||||
=====
|
||||
|
||||
To make use of this facility, one of the following functions that are
|
||||
implemented in the module can be called after doing:
|
||||
|
||||
#include <linux/dns_resolver.h>
|
||||
|
||||
(1) int dns_query(const char *type, const char *name, size_t namelen,
|
||||
const char *options, char **_result, time_t *_expiry);
|
||||
|
||||
This is the basic access function. It looks for a cached DNS query and if
|
||||
it doesn't find it, it upcalls to userspace to make a new DNS query, which
|
||||
may then be cached. The key description is constructed as a string of the
|
||||
form:
|
||||
|
||||
[<type>:]<name>
|
||||
|
||||
where <type> optionally specifies the particular upcall program to invoke,
|
||||
and thus the type of query to do, and <name> specifies the string to be
|
||||
looked up. The default query type is a straight hostname to IP address
|
||||
set lookup.
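
As an illustration of the description format only, the following is a
userspace sketch and not the kernel's own code. The function name
dns_key_description() and its buffer-based interface are assumptions made
for this example; only the "[<type>:]<name>" layout and the fact that name
need not be NUL-terminated come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "[<type>:]<name>" into buf.
 * type may be NULL (default query type); name is namelen bytes long and
 * need not be NUL-terminated.
 * Returns the description length, or -1 if buf is too small.
 */
int dns_key_description(char *buf, size_t buflen, const char *type,
			const char *name, size_t namelen)
{
	int n;

	assert(buf != NULL && name != NULL);
	if (type != NULL)
		n = snprintf(buf, buflen, "%s:%.*s", type, (int)namelen, name);
	else
		n = snprintf(buf, buflen, "%.*s", (int)namelen, name);
	if (n < 0 || (size_t)n >= buflen)
		return -1;
	return n;
}
```

With type "foo" and name "example.com" this produces "foo:example.com",
which is the form a "foo:*" line in /etc/request-key.conf would match.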
|
||||
|
||||
The name parameter is not required to be a NUL-terminated string, and its
|
||||
length should be given by the namelen argument.
|
||||
|
||||
The options parameter may be NULL or it may be a set of options
|
||||
appropriate to the query type.
|
||||
|
||||
The return value is a string appropriate to the query type. For instance,
|
||||
for the default query type it is just a list of comma-separated IPv4 and
|
||||
IPv6 addresses. The caller must free the result.
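
A caller of the default query type typically has to walk that
comma-separated list. The sketch below is plain userspace C and only counts
the entries; the function name dns_result_count() is an assumption made for
this example, and it does not validate the addresses themselves.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the entries in a comma-separated address
 * list. An empty string counts as zero entries.
 */
size_t dns_result_count(const char *result)
{
	size_t count;

	assert(result != NULL);
	if (*result == '\0')
		return 0;
	count = 1;
	for (; *result != '\0'; result++)
		if (*result == ',')
			count++;
	return count;
}
```

IPv6 addresses contain colons but no commas, so a mixed list such as
"192.0.2.1,2001:db8::1" still splits cleanly on the comma.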
|
||||
|
||||
The length of the result string is returned on success, and a negative
|
||||
error code is returned otherwise. -EKEYREJECTED will be returned if the
|
||||
DNS lookup failed.
|
||||
|
||||
If _expiry is non-NULL, the expiry time (TTL) of the result will be
|
||||
returned also.
|
||||
|
||||
The kernel maintains an internal keyring in which it caches looked up keys.
|
||||
This can be cleared by any process that has the CAP_SYS_ADMIN capability by
|
||||
the use of KEYCTL_KEYRING_CLEAR on the keyring ID.
|
||||
|
||||
|
||||
===============================
|
||||
READING DNS KEYS FROM USERSPACE
|
||||
===============================
|
||||
|
||||
Keys of dns_resolver type can be read from userspace using keyctl_read() or
|
||||
"keyctl read/print/pipe".
|
||||
|
||||
|
||||
=========
|
||||
MECHANISM
|
||||
=========
|
||||
|
||||
The dnsresolver module registers a key type called "dns_resolver". Keys of
|
||||
this type are used to transport and cache DNS lookup results from userspace.
|
||||
|
||||
When dns_query() is invoked, it calls request_key() to search the local
|
||||
keyrings for a cached DNS result. If that fails to find one, it upcalls to
|
||||
userspace to get a new result.
|
||||
|
||||
Upcalls to userspace are made through the request_key() upcall vector, and are
|
||||
directed by means of configuration lines in /etc/request-key.conf that tell
|
||||
/sbin/request-key what program to run to instantiate the key.
|
||||
|
||||
The upcall handler program is responsible for querying the DNS, processing the
|
||||
result into a form suitable for passing to the keyctl_instantiate_key()
|
||||
routine. This then passes the data to dns_resolver_instantiate() which strips
|
||||
off and processes any options included in the data, and then attaches the
|
||||
remainder of the string to the key as its payload.
|
||||
|
||||
The upcall handler program should set the expiry time on the key to that of the
|
||||
lowest TTL of all the records it has extracted a result from. This means that
|
||||
the key will be discarded and recreated when the data it holds has expired.
|
||||
|
||||
dns_query() returns a copy of the value attached to the key, or an error if
|
||||
that is indicated instead.
|
||||
|
||||
See <file:Documentation/security/keys-request-key.txt> for further
|
||||
information about the request-key function.
|
||||
|
||||
|
||||
=========
|
||||
DEBUGGING
|
||||
=========
|
||||
|
||||
Debugging messages can be turned on dynamically by writing a 1 into the
|
||||
following file:
|
||||
|
||||
/sys/module/dnsresolver/parameters/debug
|
||||
93
Documentation/networking/driver.txt
Normal file
|
|
@ -0,0 +1,93 @@
|
|||
Document about softnet driver issues
|
||||
|
||||
Transmit path guidelines:
|
||||
|
||||
1) The ndo_start_xmit method must not return NETDEV_TX_BUSY under
any normal circumstances. It is considered a hard error unless
there is no way your device can tell ahead of time when its
transmit function will become busy.
|
||||
|
||||
Instead it must maintain the queue properly. For example,
|
||||
for a driver implementing scatter-gather this means:
|
||||
|
||||
	static netdev_tx_t drv_hard_start_xmit(struct sk_buff *skb,
					       struct net_device *dev)
	{
		struct drv *dp = netdev_priv(dev);

		lock_tx(dp);
		...
		/* This is a hard error; log it. */
		if (TX_BUFFS_AVAIL(dp) <= (skb_shinfo(skb)->nr_frags + 1)) {
			netif_stop_queue(dev);
			unlock_tx(dp);
			printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n",
			       dev->name);
			return NETDEV_TX_BUSY;
		}

		... queue packet to card ...
		... update tx consumer index ...

		if (TX_BUFFS_AVAIL(dp) <= (MAX_SKB_FRAGS + 1))
			netif_stop_queue(dev);

		...
		unlock_tx(dp);
		...
		return NETDEV_TX_OK;
	}
|
||||
|
||||
And then at the end of your TX reclamation event handling:
|
||||
|
||||
	if (netif_queue_stopped(dp->dev) &&
	    TX_BUFFS_AVAIL(dp) > (MAX_SKB_FRAGS + 1))
		netif_wake_queue(dp->dev);
|
||||
|
||||
For a non-scatter-gather supporting card, the three tests simply become:
|
||||
|
||||
	/* This is a hard error; log it. */
	if (TX_BUFFS_AVAIL(dp) <= 0)
|
||||
|
||||
and:
|
||||
|
||||
	if (TX_BUFFS_AVAIL(dp) == 0)
|
||||
|
||||
and:
|
||||
|
||||
	if (netif_queue_stopped(dp->dev) &&
	    TX_BUFFS_AVAIL(dp) > 0)
		netif_wake_queue(dp->dev);
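
The stop/wake discipline described above can be tried out in userspace.
The following is a minimal simulation, not driver code: struct sim_ring,
sim_xmit(), sim_reclaim() and the small SIM_MAX_FRAGS value are all
assumptions made for this sketch. The three comparisons mirror the
scatter-gather tests shown earlier in this section.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_MAX_FRAGS 2		/* stand-in for MAX_SKB_FRAGS (assumption) */

enum sim_tx { SIM_TX_OK, SIM_TX_BUSY };

struct sim_ring {
	int size;	/* total descriptors in the ring */
	int in_use;	/* descriptors currently owned by the "hardware" */
	bool stopped;	/* models netif_stop_queue()/netif_wake_queue() */
};

static int sim_avail(const struct sim_ring *r)
{
	return r->size - r->in_use;
}

/* Transmit path: one descriptor per fragment plus one for the head. */
enum sim_tx sim_xmit(struct sim_ring *r, int nr_frags)
{
	assert(nr_frags >= 0 && nr_frags <= SIM_MAX_FRAGS);
	if (sim_avail(r) <= nr_frags + 1) {
		/* Hard error: the queue should have been stopped earlier. */
		r->stopped = true;
		return SIM_TX_BUSY;
	}
	r->in_use += nr_frags + 1;
	/* Stop now if a worst-case packet might not fit next time. */
	if (sim_avail(r) <= SIM_MAX_FRAGS + 1)
		r->stopped = true;
	return SIM_TX_OK;
}

/* TX reclamation: release descriptors, then wake if there is room. */
void sim_reclaim(struct sim_ring *r, int descriptors)
{
	assert(descriptors >= 0 && descriptors <= r->in_use);
	r->in_use -= descriptors;
	if (r->stopped && sim_avail(r) > SIM_MAX_FRAGS + 1)
		r->stopped = false;
}
```

Because the queue is stopped as soon as a worst-case packet might not fit,
the SIM_TX_BUSY branch is only reached if the stop/wake bookkeeping is
wrong, which is why the text treats it as a hard error.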
|
||||
|
||||
2) An ndo_start_xmit method must not modify the shared parts of a
|
||||
cloned SKB.
|
||||
|
||||
3) Do not forget that once you return NETDEV_TX_OK from your
ndo_start_xmit method, it is your driver's responsibility to free
up the SKB, and to do so in some finite amount of time.
|
||||
|
||||
For example, this means that it is not allowed for your TX
|
||||
mitigation scheme to let TX packets "hang out" in the TX
|
||||
ring unreclaimed forever if no new TX packets are sent.
|
||||
This error can deadlock sockets waiting for send buffer room
|
||||
to be freed up.
|
||||
|
||||
If you return NETDEV_TX_BUSY from the ndo_start_xmit method, you
|
||||
must not keep any reference to that SKB and you must not attempt
|
||||
to free it up.
|
||||
|
||||
Probing guidelines:
|
||||
|
||||
1) Any hardware layer address you obtain for your device should
|
||||
be verified. For example, for ethernet check it with
|
||||
linux/etherdevice.h:is_valid_ether_addr()
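
For readers without the kernel headers at hand, the checks behind
is_valid_ether_addr() can be sketched in userspace as below. The function
name sketch_valid_ether_addr() is an assumption made for this example; in
a driver, call the real helper from linux/etherdevice.h instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace sketch: an address is usable as a device address if it is
 * not multicast (low bit of the first octet set, which also covers the
 * broadcast address) and not all zeroes.
 */
bool sketch_valid_ether_addr(const uint8_t addr[6])
{
	uint8_t any_bits = 0;
	int i;

	assert(addr != NULL);
	if (addr[0] & 0x01)	/* multicast or broadcast */
		return false;
	for (i = 0; i < 6; i++)
		any_bits |= addr[i];
	return any_bits != 0;	/* reject 00:00:00:00:00:00 */
}
```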
|
||||
|
||||
Close/stop guidelines:
|
||||
|
||||
1) After the ndo_stop routine has been called, the hardware must
|
||||
not receive or transmit any data. All in flight packets must
|
||||
be aborted. If necessary, poll or wait for completion of
|
||||
any reset commands.
|
||||
|
||||
2) The ndo_stop routine will be called by unregister_netdevice
|
||||
if the device is still UP.
|
||||
197
Documentation/networking/e100.txt
Normal file
|
|
@ -0,0 +1,197 @@
|
|||
Linux* Base Driver for the Intel(R) PRO/100 Family of Adapters
|
||||
==============================================================
|
||||
|
||||
March 15, 2011
|
||||
|
||||
Contents
|
||||
========
|
||||
|
||||
- In This Release
|
||||
- Identifying Your Adapter
|
||||
- Building and Installation
|
||||
- Driver Configuration Parameters
|
||||
- Additional Configurations
|
||||
- Known Issues
|
||||
- Support
|
||||
|
||||
|
||||
In This Release
|
||||
===============
|
||||
|
||||
This file describes the Linux* Base Driver for the Intel(R) PRO/100 Family of
|
||||
Adapters. This driver includes support for Itanium(R)2-based systems.
|
||||
|
||||
For questions related to hardware requirements, refer to the documentation
|
||||
supplied with your Intel PRO/100 adapter.
|
||||
|
||||
The following features are now available in supported kernels:
|
||||
- Native VLANs
|
||||
- Channel Bonding (teaming)
|
||||
- SNMP
|
||||
|
||||
Channel Bonding documentation can be found in the Linux kernel source:
|
||||
/Documentation/networking/bonding.txt
|
||||
|
||||
|
||||
Identifying Your Adapter
|
||||
========================
|
||||
|
||||
For more information on how to identify your adapter, go to the Adapter &
|
||||
Driver ID Guide at:
|
||||
|
||||
http://support.intel.com/support/network/adapter/pro100/21397.htm
|
||||
|
||||
For the latest Intel network drivers for Linux, refer to the following
|
||||
website. In the search field, enter your adapter name or type, or use the
|
||||
networking link on the left to search for your adapter:
|
||||
|
||||
http://downloadfinder.intel.com/scripts-df/support_intel.asp
|
||||
|
||||
Driver Configuration Parameters
|
||||
===============================
|
||||
|
||||
The default value for each parameter is generally the recommended setting,
|
||||
unless otherwise noted.
|
||||
|
||||
Rx Descriptors: Number of receive descriptors. A receive descriptor is a data
|
||||
structure that describes a receive buffer and its attributes to the network
|
||||
controller. The data in the descriptor is used by the controller to write
|
||||
data from the controller to host memory. In the 3.x.x driver the valid range
|
||||
for this parameter is 64-256. The default value is 64. This parameter can be
|
||||
changed using the command:
|
||||
|
||||
ethtool -G eth? rx n, where n is the number of desired rx descriptors.
|
||||
|
||||
Tx Descriptors: Number of transmit descriptors. A transmit descriptor is a data
|
||||
structure that describes a transmit buffer and its attributes to the network
|
||||
controller. The data in the descriptor is used by the controller to read
|
||||
data from the host memory to the controller. In the 3.x.x driver the valid
|
||||
range for this parameter is 64-256. The default value is 64. This parameter
|
||||
can be changed using the command:
|
||||
|
||||
ethtool -G eth? tx n, where n is the number of desired tx descriptors.
|
||||
|
||||
Speed/Duplex: The driver auto-negotiates the link speed and duplex settings by
|
||||
default. The ethtool utility can be used as follows to force speed/duplex.
|
||||
|
||||
ethtool -s eth? autoneg off speed {10|100} duplex {full|half}
|
||||
|
||||
NOTE: setting the speed/duplex to incorrect values will cause the link to
|
||||
fail.
|
||||
|
||||
Event Log Message Level: The driver uses the message level flag to log events
|
||||
to syslog. The message level can be set at driver load time. It can also be
|
||||
set using the command:
|
||||
|
||||
ethtool -s eth? msglvl n
|
||||
|
||||
|
||||
Additional Configurations
|
||||
=========================
|
||||
|
||||
Configuring the Driver on Different Distributions
|
||||
-------------------------------------------------
|
||||
|
||||
Configuring a network driver to load properly when the system is started is
|
||||
distribution dependent. Typically, the configuration process involves adding
|
||||
an alias line to /etc/modprobe.d/*.conf as well as editing other system
|
||||
startup scripts and/or configuration files. Many popular Linux
|
||||
distributions ship with tools to make these changes for you. To learn the
|
||||
proper way to configure a network device for your system, refer to your
|
||||
distribution documentation. If during this process you are asked for the
|
||||
driver or module name, the name for the Linux Base Driver for the Intel
|
||||
PRO/100 Family of Adapters is e100.
|
||||
|
||||
As an example, if you install the e100 driver for two PRO/100 adapters
|
||||
(eth0 and eth1), add the following to a configuration file in /etc/modprobe.d/
|
||||
|
||||
alias eth0 e100
|
||||
alias eth1 e100
|
||||
|
||||
Viewing Link Messages
|
||||
---------------------
|
||||
In order to see link messages and other Intel driver information on your
|
||||
console, you must set the dmesg level up to six. This can be done by
|
||||
entering the following on the command line before loading the e100 driver:
|
||||
|
||||
dmesg -n 8
|
||||
|
||||
If you wish to see all messages issued by the driver, including debug
|
||||
messages, set the dmesg level to eight.
|
||||
|
||||
NOTE: This setting is not saved across reboots.
|
||||
|
||||
|
||||
ethtool
|
||||
-------
|
||||
|
||||
The driver utilizes the ethtool interface for driver configuration and
|
||||
diagnostics, as well as displaying statistical information. The ethtool
|
||||
version 1.6 or later is required for this functionality.
|
||||
|
||||
The latest release of ethtool can be found at
|
||||
http://ftp.kernel.org/pub/software/network/ethtool/
|
||||
|
||||
Enabling Wake on LAN* (WoL)
|
||||
---------------------------
|
||||
WoL is provided through the ethtool* utility. For instructions on enabling
|
||||
WoL with ethtool, refer to the ethtool man page.
|
||||
|
||||
WoL will be enabled on the system during the next shut down or reboot. For
|
||||
this driver version, in order to enable WoL, the e100 driver must be
|
||||
loaded when shutting down or rebooting the system.
|
||||
|
||||
NAPI
|
||||
----
|
||||
|
||||
NAPI (Rx polling mode) is supported in the e100 driver.
|
||||
|
||||
See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI.
|
||||
|
||||
Multiple Interfaces on Same Ethernet Broadcast Network
|
||||
------------------------------------------------------
|
||||
|
||||
Due to the default ARP behavior on Linux, it is not possible to have
|
||||
one system on two IP networks in the same Ethernet broadcast domain
|
||||
(non-partitioned switch) behave as expected. All Ethernet interfaces
|
||||
will respond to IP traffic for any IP address assigned to the system.
|
||||
This results in unbalanced receive traffic.
|
||||
|
||||
If you have multiple interfaces in a server, either turn on ARP
|
||||
filtering by
|
||||
|
||||
(1) entering: echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
|
||||
(this only works if your kernel's version is higher than 2.4.5), or
|
||||
|
||||
(2) installing the interfaces in separate broadcast domains (either
|
||||
in different switches or in a switch partitioned to VLANs).
|
||||
|
||||
|
||||
Support
|
||||
=======
|
||||
|
||||
For general information, go to the Intel support website at:
|
||||
|
||||
http://support.intel.com
|
||||
|
||||
or the Intel Wired Networking project hosted by Sourceforge at:
|
||||
|
||||
http://sourceforge.net/projects/e1000
|
||||
|
||||
If an issue is identified with the released source code on the supported
|
||||
kernel with a supported adapter, email the specific information related to the
|
||||
issue to e1000-devel@lists.sourceforge.net.
|
||||
|
||||
|
||||
License
|
||||
=======
|
||||
|
||||
This software program is released under the terms of a license agreement
|
||||
between you ('Licensee') and Intel. Do not use or load this software or any
|
||||
associated materials (collectively, the 'Software') until you have carefully
|
||||
read the full terms and conditions of the file COPYING located in this software
|
||||
package. By loading or using the Software, you agree to the terms of this
|
||||
Agreement. If you do not agree with the terms of this Agreement, do not install
|
||||
or use the Software.
|
||||
|
||||
* Other names and brands may be claimed as the property of others.
|
||||
461
Documentation/networking/e1000.txt
Normal file
|
|
@ -0,0 +1,461 @@
|
|||
Linux* Base Driver for Intel(R) Ethernet Network Connection
|
||||
===========================================================
|
||||
|
||||
Intel Gigabit Linux driver.
|
||||
Copyright(c) 1999 - 2013 Intel Corporation.
|
||||
|
||||
Contents
|
||||
========
|
||||
|
||||
- Identifying Your Adapter
|
||||
- Command Line Parameters
|
||||
- Speed and Duplex Configuration
|
||||
- Additional Configurations
|
||||
- Support
|
||||
|
||||
Identifying Your Adapter
|
||||
========================
|
||||
|
||||
For more information on how to identify your adapter, go to the Adapter &
|
||||
Driver ID Guide at:
|
||||
|
||||
http://support.intel.com/support/go/network/adapter/idguide.htm
|
||||
|
||||
For the latest Intel network drivers for Linux, refer to the following
|
||||
website. In the search field, enter your adapter name or type, or use the
|
||||
networking link on the left to search for your adapter:
|
||||
|
||||
http://support.intel.com/support/go/network/adapter/home.htm
|
||||
|
||||
Command Line Parameters
|
||||
=======================
|
||||
|
||||
The default value for each parameter is generally the recommended setting,
|
||||
unless otherwise noted.
|
||||
|
||||
NOTES: For more information about the AutoNeg, Duplex, and Speed
|
||||
parameters, see the "Speed and Duplex Configuration" section in
|
||||
this document.
|
||||
|
||||
For more information about the InterruptThrottleRate,
|
||||
RxIntDelay, TxIntDelay, RxAbsIntDelay, and TxAbsIntDelay
|
||||
parameters, see the application note at:
|
||||
http://www.intel.com/design/network/applnots/ap450.htm
|
||||
|
||||
AutoNeg
|
||||
-------
|
||||
(Supported only on adapters with copper connections)
|
||||
Valid Range: 0x01-0x0F, 0x20-0x2F
|
||||
Default Value: 0x2F
|
||||
|
||||
This parameter is a bit-mask that specifies the speed and duplex settings
|
||||
advertised by the adapter. When this parameter is used, the Speed and
|
||||
Duplex parameters must not be specified.
|
||||
|
||||
NOTE: Refer to the Speed and Duplex section of this readme for more
|
||||
information on the AutoNeg parameter.
|
||||
|
||||
Duplex
|
||||
------
|
||||
(Supported only on adapters with copper connections)
|
||||
Valid Range: 0-2 (0=auto-negotiate, 1=half, 2=full)
|
||||
Default Value: 0
|
||||
|
||||
This defines the direction in which data is allowed to flow: either one
direction at a time (half duplex) or both directions simultaneously (full
duplex). If both Duplex and the link partner are set to auto-negotiate,
the board auto-detects the correct duplex. If the link partner is forced
(either full or half), Duplex defaults to half-duplex.
|
||||
|
||||
FlowControl
|
||||
-----------
|
||||
Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx)
|
||||
Default Value: Reads flow control settings from the EEPROM
|
||||
|
||||
This parameter controls the automatic generation (Tx) of and response (Rx)
to Ethernet PAUSE frames.
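
The four FlowControl values can be read as two independent bits. The
sketch below is illustrative userspace C, not driver code; struct
fc_setting and decode_flow_control() are names assumed for this example.

```c
#include <assert.h>
#include <stdbool.h>

struct fc_setting {
	bool rx;	/* respond to received PAUSE frames */
	bool tx;	/* generate PAUSE frames */
};

/* Decode FlowControl: 0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx. */
struct fc_setting decode_flow_control(unsigned int value)
{
	struct fc_setting fc;

	assert(value <= 3);
	fc.rx = (value & 1) != 0;
	fc.tx = (value & 2) != 0;
	return fc;
}
```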
|
||||
|
||||
InterruptThrottleRate
|
||||
---------------------
|
||||
(not supported on Intel(R) 82542, 82543 or 82544-based adapters)
|
||||
Valid Range: 0,1,3,4,100-100000 (0=off, 1=dynamic, 3=dynamic conservative,
|
||||
4=simplified balancing)
|
||||
Default Value: 3
|
||||
|
||||
The driver can limit the number of interrupts per second that the adapter
will generate for incoming packets. It does this by writing a value to the
adapter that is based on the maximum number of interrupts that the adapter
will generate per second.
|
||||
|
||||
Setting InterruptThrottleRate to a value greater than or equal to 100
|
||||
will program the adapter to send out a maximum of that many interrupts
|
||||
per second, even if more packets have come in. This reduces interrupt
|
||||
load on the system and can lower CPU utilization under heavy load,
|
||||
but will increase latency as packets are not processed as quickly.
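
To get a feel for the latency cost of a fixed rate, the cap can be turned
into a spacing between interrupts. The sketch below is plain arithmetic in
userspace C; the function names are assumptions made for this example, and
it assumes interrupts are spread evenly at the maximum rate.

```c
#include <assert.h>
#include <stdbool.h>

/* Values accepted according to the "Valid Range" line for this parameter. */
bool itr_value_valid(unsigned int v)
{
	return v == 0 || v == 1 || v == 3 || v == 4 ||
	       (v >= 100 && v <= 100000);
}

/*
 * For a fixed rate (100-100000 interrupts per second), the spacing
 * between two interrupts in whole microseconds, rounded down.
 */
unsigned int itr_min_gap_us(unsigned int rate)
{
	assert(rate >= 100 && rate <= 100000);
	return 1000000u / rate;
}
```

For example, a rate of 8000 corresponds to one interrupt every 125
microseconds at most, and 3000 to roughly one every 333 microseconds.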
|
||||
|
||||
The default behaviour of the driver previously assumed a static
|
||||
InterruptThrottleRate value of 8000, providing a good fallback value for
|
||||
all traffic types, but lacking in small packet performance and latency.
|
||||
The hardware can handle many more small packets per second however, and
|
||||
for this reason an adaptive interrupt moderation algorithm was implemented.
|
||||
|
||||
Since 7.3.x, the driver has two adaptive modes (setting 1 or 3) in which
|
||||
it dynamically adjusts the InterruptThrottleRate value based on the traffic
|
||||
that it receives. After determining the type of incoming traffic in the last
|
||||
timeframe, it will adjust the InterruptThrottleRate to an appropriate value
|
||||
for that traffic.
|
||||
|
||||
The algorithm classifies the incoming traffic every interval into
|
||||
classes. Once the class is determined, the InterruptThrottleRate value is
|
||||
adjusted to suit that traffic type the best. There are three classes defined:
|
||||
"Bulk traffic", for large amounts of packets of normal size; "Low latency",
|
||||
for small amounts of traffic and/or a significant percentage of small
|
||||
packets; and "Lowest latency", for almost completely small packets or
|
||||
minimal traffic.
|
||||
|
||||
In dynamic conservative mode, the InterruptThrottleRate value is set to 4000
|
||||
for traffic that falls in class "Bulk traffic". If traffic falls in the "Low
|
||||
latency" or "Lowest latency" class, the InterruptThrottleRate is increased
|
||||
stepwise to 20000. This default mode is suitable for most applications.
|
||||
|
||||
For situations where low latency is vital such as cluster or
|
||||
grid computing, the algorithm can reduce latency even more when
|
||||
InterruptThrottleRate is set to mode 1. In this mode, which operates
|
||||
the same as mode 3, the InterruptThrottleRate will be increased stepwise to
|
||||
70000 for traffic in class "Lowest latency".
|
||||
|
||||
In simplified mode the interrupt rate is based on the ratio of TX and
|
||||
RX traffic. If the bytes per second rate is approximately equal, the
|
||||
interrupt rate will drop as low as 2000 interrupts per second. If the
|
||||
traffic is mostly transmit or mostly receive, the interrupt rate could
|
||||
be as high as 8000.
|
||||
|
||||
Setting InterruptThrottleRate to 0 turns off any interrupt moderation
|
||||
and may improve small packet latency, but is generally not suitable
|
||||
for bulk throughput traffic.
|
||||
|
||||
NOTE: InterruptThrottleRate takes precedence over the TxAbsIntDelay and
|
||||
RxAbsIntDelay parameters. In other words, minimizing the receive
|
||||
and/or transmit absolute delays does not force the controller to
|
||||
generate more interrupts than what the Interrupt Throttle Rate
|
||||
allows.
|
||||
|
||||
CAUTION: If you are using the Intel(R) PRO/1000 CT Network Connection
|
||||
(controller 82547), setting InterruptThrottleRate to a value
|
||||
greater than 75,000 may hang (stop transmitting) adapters
|
||||
under certain network conditions. If this occurs a NETDEV
|
||||
WATCHDOG message is logged in the system event log. In
|
||||
addition, the controller is automatically reset, restoring
|
||||
the network connection. To eliminate the potential for the
|
||||
hang, ensure that InterruptThrottleRate is set no greater
|
||||
than 75,000 and is not set to 0.
|
||||
|
||||
NOTE: When e1000 is loaded with default settings and multiple adapters
|
||||
are in use simultaneously, the CPU utilization may increase non-
|
||||
linearly. In order to limit the CPU utilization without impacting
|
||||
the overall throughput, we recommend that you load the driver as
|
||||
follows:
|
||||
|
||||
modprobe e1000 InterruptThrottleRate=3000,3000,3000
|
||||
|
||||
This sets the InterruptThrottleRate to 3000 interrupts/sec for
|
||||
the first, second, and third instances of the driver. The range
|
||||
of 2000 to 3000 interrupts per second works on a majority of
|
||||
systems and is a good starting point, but the optimal value will
|
||||
be platform-specific. If CPU utilization is not a concern, use
|
||||
RX_POLLING (NAPI) and default driver settings.
|
||||
|
||||
RxDescriptors
|
||||
-------------
|
||||
Valid Range: 80-256 for 82542 and 82543-based adapters
|
||||
80-4096 for all other supported adapters
|
||||
Default Value: 256
|
||||
|
||||
This value specifies the number of receive buffer descriptors allocated
|
||||
by the driver. Increasing this value allows the driver to buffer more
|
||||
incoming packets, at the expense of increased system memory utilization.
|
||||
|
||||
Each descriptor is 16 bytes. A receive buffer is also allocated for each
|
||||
descriptor and can be either 2048, 4096, 8192, or 16384 bytes, depending
|
||||
on the MTU setting. The maximum MTU size is 16110.
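
The memory cost of a larger receive ring follows directly from those
figures. The sketch below is a rough userspace estimate only:
rx_ring_bytes() is a name assumed for this example, and it ignores
allocator overhead and alignment padding.

```c
#include <assert.h>
#include <stddef.h>

#define DESC_BYTES 16	/* descriptor size stated in the text */

/*
 * Approximate footprint of one receive ring: the descriptors plus one
 * receive buffer per descriptor.
 */
size_t rx_ring_bytes(size_t descriptors, size_t buffer_bytes)
{
	assert(descriptors >= 80 && descriptors <= 4096);
	assert(buffer_bytes == 2048 || buffer_bytes == 4096 ||
	       buffer_bytes == 8192 || buffer_bytes == 16384);
	return descriptors * (DESC_BYTES + buffer_bytes);
}
```

With the default of 256 descriptors and 2048-byte buffers this comes to
528384 bytes per adapter; 4096 descriptors with 16384-byte buffers need
about 64 MiB.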
|
||||
|
||||
NOTE: MTU designates the frame size. It only needs to be set for Jumbo
|
||||
Frames. Depending on the available system resources, the request
|
||||
for a higher number of receive descriptors may be denied. In this
|
||||
case, use a lower number.
|
||||
|
||||
RxIntDelay
|
||||
----------
|
||||
Valid Range: 0-65535 (0=off)
|
||||
Default Value: 0
|
||||
|
||||
This value delays the generation of receive interrupts in units of 1.024
|
||||
microseconds. Receive interrupt reduction can improve CPU efficiency if
|
||||
properly tuned for specific network traffic. Increasing this value adds
|
||||
extra latency to frame reception and can end up decreasing the throughput
|
||||
of TCP traffic. If the system is reporting dropped receives, this value
|
||||
may be set too high, causing the driver to run out of available receive
|
||||
descriptors.
|
||||
|
||||
CAUTION: When setting RxIntDelay to a value other than 0, adapters may
|
||||
hang (stop transmitting) under certain network conditions. If
|
||||
this occurs a NETDEV WATCHDOG message is logged in the system
|
||||
event log. In addition, the controller is automatically reset,
|
||||
restoring the network connection. To eliminate the potential
|
||||
for the hang ensure that RxIntDelay is set to 0.
|
||||
|
||||
RxAbsIntDelay
|
||||
-------------
|
||||
(This parameter is supported only on 82540, 82545 and later adapters.)
|
||||
Valid Range: 0-65535 (0=off)
|
||||
Default Value: 128
|
||||
|
||||
This value, in units of 1.024 microseconds, limits the delay in which a
|
||||
receive interrupt is generated. Useful only if RxIntDelay is non-zero,
|
||||
this value ensures that an interrupt is generated after the initial
|
||||
packet is received within the set amount of time. Proper tuning,
|
||||
along with RxIntDelay, may improve traffic throughput in specific network
|
||||
conditions.
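
Since the delay parameters count in units of 1.024 microseconds, each unit
is exactly 1024 nanoseconds. The helper below is a userspace sketch for
converting a setting into time; the name int_delay_ns() is an assumption
made for this example.

```c
#include <assert.h>

/*
 * Convert an interrupt delay setting (0-65535, in units of 1.024
 * microseconds) to nanoseconds.
 */
unsigned long int_delay_ns(unsigned int units)
{
	assert(units <= 65535);
	return (unsigned long)units * 1024ul;
}
```

The RxAbsIntDelay default of 128 therefore corresponds to 131072 ns, or
about 131 microseconds.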
|
||||
|
||||
Speed
|
||||
-----
|
||||
(This parameter is supported only on adapters with copper connections.)
|
||||
Valid Settings: 0, 10, 100, 1000
|
||||
Default Value: 0 (auto-negotiate at all supported speeds)
|
||||
|
||||
Speed forces the line speed to the specified value in megabits per second
|
||||
(Mbps). If this parameter is not specified or is set to 0 and the link
|
||||
partner is set to auto-negotiate, the board will auto-detect the correct
|
||||
speed. Duplex should also be set when Speed is set to either 10 or 100.
|
||||
|
||||
TxDescriptors
|
||||
-------------
|
||||
Valid Range: 80-256 for 82542 and 82543-based adapters
|
||||
80-4096 for all other supported adapters
|
||||
Default Value: 256
|
||||
|
||||
This value is the number of transmit descriptors allocated by the driver.
|
||||
Increasing this value allows the driver to queue more transmits. Each
|
||||
descriptor is 16 bytes.
|
||||
|
||||
NOTE: Depending on the available system resources, the request for a
|
||||
higher number of transmit descriptors may be denied. In this case,
|
||||
use a lower number.
|
||||
|
||||
TxDescriptorStep
|
||||
----------------
|
||||
Valid Range: 1 (use every Tx Descriptor)
|
||||
4 (use every 4th Tx Descriptor)
|
||||
|
||||
Default Value: 1 (use every Tx Descriptor)
|
||||
|
||||
On certain non-Intel architectures, it has been observed that intense TX
|
||||
traffic bursts of short packets may result in an improper descriptor
|
||||
writeback. If this occurs, the driver will report a "TX Timeout" and reset
|
||||
the adapter, after which the transmit flow will restart, though data may
|
||||
have stalled for as much as 10 seconds before it resumes.
|
||||
|
||||
The improper writeback does not occur on the first descriptor in a system
|
||||
memory cache-line, which is typically 32 bytes, or 4 descriptors long.
|
||||
|
||||
Setting TxDescriptorStep to a value of 4 will ensure that all TX descriptors
|
||||
are aligned to the start of a system memory cache line, and so this problem
|
||||
will not occur.
|
||||
|
||||
NOTES: Setting TxDescriptorStep to 4 effectively reduces the number of
|
||||
TxDescriptors available for transmits to 1/4 of the normal allocation.
|
||||
This has a possible negative performance impact, which may be
|
||||
compensated for by allocating more descriptors using the TxDescriptors
|
||||
module parameter.
|
||||
|
||||
There are other conditions which may result in "TX Timeout", which will
|
||||
not be resolved by the use of the TxDescriptorStep parameter. As the
|
||||
issue addressed by this parameter has never been observed on Intel
|
||||
Architecture platforms, it should not be used on Intel platforms.
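
The trade-off described in the notes above is simple division. The sketch
below is userspace C for illustration; tx_usable_descriptors() is a name
assumed for this example.

```c
#include <assert.h>

/*
 * Number of descriptors actually available for transmits when only
 * every step-th descriptor is used (step is 1 or 4).
 */
unsigned int tx_usable_descriptors(unsigned int tx_descriptors,
				   unsigned int step)
{
	assert(step == 1 || step == 4);
	return tx_descriptors / step;
}
```

With the default of 256 TxDescriptors, a step of 4 leaves 64 usable
descriptors; setting TxDescriptors to 1024 brings the usable count back to
256.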
|
||||
|
||||
TxIntDelay
|
||||
----------
|
||||
Valid Range: 0-65535 (0=off)
|
||||
Default Value: 64
|
||||
|
||||
This value delays the generation of transmit interrupts in units of
|
||||
1.024 microseconds. Transmit interrupt reduction can improve CPU
|
||||
efficiency if properly tuned for specific network traffic. If the
|
||||
system is reporting dropped transmits, this value may be set too high
|
||||
causing the driver to run out of available transmit descriptors.
|
||||
|
||||
TxAbsIntDelay
|
||||
-------------
|
||||
(This parameter is supported only on 82540, 82545 and later adapters.)
|
||||
Valid Range: 0-65535 (0=off)
|
||||
Default Value: 64
|
||||
|
||||
This value, in units of 1.024 microseconds, limits the delay in which a
|
||||
transmit interrupt is generated. Useful only if TxIntDelay is non-zero,
|
||||
this value ensures that an interrupt is generated after the initial
|
||||
packet is sent on the wire within the set amount of time. Proper tuning,
|
||||
along with TxIntDelay, may improve traffic throughput in specific
|
||||
network conditions.
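
Because TxIntDelay and TxAbsIntDelay are expressed in units of 1.024 microseconds, it can help to convert a setting to wall-clock time before tuning. This is a minimal sketch; the helper name is hypothetical, and integer nanoseconds are used to avoid rounding.

```python
# One delay unit is 1.024 microseconds, i.e. 1024 nanoseconds.
UNIT_NS = 1024

def delay_ns(units):
    """Convert a TxIntDelay/TxAbsIntDelay value to nanoseconds."""
    if not 0 <= units <= 65535:
        raise ValueError("valid range is 0-65535")
    return units * UNIT_NS

print(delay_ns(64))      # 65536 ns, about 65.5 microseconds
print(delay_ns(65535))   # 67107840 ns, about 67.1 milliseconds
```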
|
||||
|
||||
XsumRX
|
||||
------
|
||||
(This parameter is NOT supported on the 82542-based adapter.)
|
||||
Valid Range: 0-1
|
||||
Default Value: 1
|
||||
|
||||
A value of '1' indicates that the driver should enable IP checksum
|
||||
offload for received packets (both UDP and TCP) to the adapter hardware.
|
||||
|
||||
Copybreak
|
||||
---------
|
||||
Valid Range: 0-xxxxxxx (0=off)
|
||||
Default Value: 256
|
||||
Usage: insmod e1000.ko copybreak=128
|
||||
|
||||
Driver copies all packets below or equaling this size to a fresh RX
|
||||
buffer before handing it up the stack.
|
||||
|
||||
This parameter is different than other parameters, in that it is a
|
||||
single (not 1,1,1 etc.) parameter applied to all driver instances and
|
||||
it is also available during runtime at
|
||||
/sys/module/e1000/parameters/copybreak
|
||||
|
||||
SmartPowerDownEnable
|
||||
--------------------
|
||||
Valid Range: 0-1
|
||||
Default Value: 0 (disabled)
|
||||
|
||||
Allows PHY to turn off in lower power states. The user can turn off
|
||||
this parameter in supported chipsets.
|
||||
|
||||
KumeranLockLoss
|
||||
---------------
|
||||
Valid Range: 0-1
|
||||
Default Value: 1 (enabled)
|
||||
|
||||
This workaround skips resetting the PHY at shutdown for the initial
|
||||
silicon releases of ICH8 systems.
|
||||
|
||||
Speed and Duplex Configuration
|
||||
==============================
|
||||
|
||||
Three keywords are used to control the speed and duplex configuration.
|
||||
These keywords are Speed, Duplex, and AutoNeg.
|
||||
|
||||
If the board uses a fiber interface, these keywords are ignored, and the
|
||||
fiber interface board only links at 1000 Mbps full-duplex.
|
||||
|
||||
For copper-based boards, the keywords interact as follows:
|
||||
|
||||
The default operation is auto-negotiate. The board advertises all
|
||||
supported speed and duplex combinations, and it links at the highest
|
||||
common speed and duplex mode IF the link partner is set to auto-negotiate.
|
||||
|
||||
If Speed = 1000, limited auto-negotiation is enabled and only 1000 Mbps
is advertised. (The 1000BaseT spec requires auto-negotiation.)
|
||||
|
||||
If Speed = 10 or 100, then both Speed and Duplex should be set.
Auto-negotiation is disabled, and the AutoNeg parameter is ignored. The link
partner SHOULD also be forced.
|
||||
|
||||
The AutoNeg parameter is used when more control is required over the
|
||||
auto-negotiation process. It should be used when you wish to control which
|
||||
speed and duplex combinations are advertised during the auto-negotiation
|
||||
process.
|
||||
|
||||
The parameter may be specified as either a decimal or hexadecimal value as
|
||||
determined by the bitmap below.
|
||||
|
||||
Bit position   7      6      5       4       3      2      1       0
Decimal Value  128    64     32      16      8      4      2       1
Hex value      80     40     20      10      8      4      2       1
Speed (Mbps)   N/A    N/A    1000    N/A     100    100    10      10
Duplex                       Full            Full   Half   Full    Half
|
||||
|
||||
Some examples of using AutoNeg:
|
||||
|
||||
modprobe e1000 AutoNeg=0x01 (Restricts autonegotiation to 10 Half)
modprobe e1000 AutoNeg=1 (Same as above)
modprobe e1000 AutoNeg=0x02 (Restricts autonegotiation to 10 Full)
modprobe e1000 AutoNeg=0x03 (Restricts autonegotiation to 10 Half or 10 Full)
modprobe e1000 AutoNeg=0x04 (Restricts autonegotiation to 100 Half)
modprobe e1000 AutoNeg=0x05 (Restricts autonegotiation to 10 Half or 100 Half)
modprobe e1000 AutoNeg=0x20 (Restricts autonegotiation to 1000 Full)
modprobe e1000 AutoNeg=32 (Same as above)
|
||||
|
||||
Note that when this parameter is used, Speed and Duplex must not be specified.
|
||||
|
||||
If the link partner is forced to a specific speed and duplex, then this
|
||||
parameter should not be used. Instead, use the Speed and Duplex parameters
|
||||
previously mentioned to force the adapter to the same speed and duplex.
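
An AutoNeg value is the bitwise OR of the bits for every speed and duplex combination to be advertised. The sketch below builds the value from the bitmap in this section; the dictionary and function names are hypothetical helpers, not part of the driver or of modprobe.

```python
# Bit values taken from the AutoNeg bitmap in this section.
AUTONEG_BITS = {
    (10, "half"): 0x01,
    (10, "full"): 0x02,
    (100, "half"): 0x04,
    (100, "full"): 0x08,
    (1000, "full"): 0x20,
}

def autoneg_mask(modes):
    """OR together the bits for each (speed, duplex) pair to advertise."""
    mask = 0
    for mode in modes:
        mask |= AUTONEG_BITS[mode]   # KeyError for an unsupported pair
    return mask

mask = autoneg_mask([(10, "half"), (100, "half")])
print("modprobe e1000 AutoNeg=0x%02x" % mask)   # modprobe e1000 AutoNeg=0x05
```

Bits 4, 6 and 7 are marked N/A in the bitmap, so the sketch deliberately has no entries for them.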
|
||||
|
||||
Additional Configurations
|
||||
=========================
|
||||
|
||||
Jumbo Frames
|
||||
------------
|
||||
Jumbo Frames support is enabled by changing the MTU to a value larger than
|
||||
the default of 1500. Use the ifconfig command to increase the MTU size.
|
||||
For example:
|
||||
|
||||
ifconfig eth<x> mtu 9000 up
|
||||
|
||||
This setting is not saved across reboots. It can be made permanent if
|
||||
you add:
|
||||
|
||||
MTU=9000
|
||||
|
||||
to the file /etc/sysconfig/network-scripts/ifcfg-eth<x>. This example
|
||||
applies to the Red Hat distributions; other distributions may store this
|
||||
setting in a different location.
|
||||
|
||||
Notes:
|
||||
Degradation in throughput performance may be observed in some Jumbo frames
|
||||
environments. If this is observed, increasing the application's socket buffer
|
||||
size and/or increasing the /proc/sys/net/ipv4/tcp_*mem entry values may help.
|
||||
See the specific application manual and /usr/src/linux*/Documentation/
|
||||
networking/ip-sysctl.txt for more details.
|
||||
|
||||
- The maximum MTU setting for Jumbo Frames is 16110. This value coincides
|
||||
with the maximum Jumbo Frames size of 16128.
|
||||
|
||||
- Using Jumbo frames at 10 or 100 Mbps is not supported and may result in
|
||||
poor performance or loss of link.
|
||||
|
||||
- Adapters based on the Intel(R) 82542 and 82573V/E controller do not
|
||||
support Jumbo Frames. These correspond to the following product names:
|
||||
Intel(R) PRO/1000 Gigabit Server Adapter
|
||||
Intel(R) PRO/1000 PM Network Connection
|
||||
|
||||
ethtool
|
||||
-------
|
||||
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. ethtool
version 1.6 or later is required for this functionality.
|
||||
|
||||
The latest release of ethtool can be found at
|
||||
http://ftp.kernel.org/pub/software/network/ethtool/
|
||||
|
||||
Enabling Wake on LAN* (WoL)
|
||||
---------------------------
|
||||
WoL is configured through the ethtool* utility.
|
||||
|
||||
WoL will be enabled on the system during the next shut down or reboot.
|
||||
For this driver version, in order to enable WoL, the e1000 driver must be
|
||||
loaded when shutting down or rebooting the system.
|
||||
|
||||
Support
|
||||
=======
|
||||
|
||||
For general information, go to the Intel support website at:
|
||||
|
||||
http://support.intel.com
|
||||
|
||||
or the Intel Wired Networking project hosted by Sourceforge at:
|
||||
|
||||
http://sourceforge.net/projects/e1000
|
||||
|
||||
If an issue is identified with the released source code on the supported
|
||||
kernel with a supported adapter, email the specific information related
|
||||
to the issue to e1000-devel@lists.sf.net
|
||||
312
Documentation/networking/e1000e.txt
Normal file
|
|
@ -0,0 +1,312 @@
|
|||
Linux* Driver for Intel(R) Ethernet Network Connection
|
||||
======================================================
|
||||
|
||||
Intel Gigabit Linux driver.
|
||||
Copyright(c) 1999 - 2013 Intel Corporation.
|
||||
|
||||
Contents
|
||||
========
|
||||
|
||||
- Identifying Your Adapter
|
||||
- Command Line Parameters
|
||||
- Additional Configurations
|
||||
- Support
|
||||
|
||||
Identifying Your Adapter
|
||||
========================
|
||||
|
||||
The e1000e driver supports all PCI Express Intel(R) Gigabit Network
|
||||
Connections, except those that are 82575, 82576 and 82580-based*.
|
||||
|
||||
* NOTE: The Intel(R) PRO/1000 P Dual Port Server Adapter is supported by
|
||||
the e1000 driver, not the e1000e driver due to the 82546 part being used
|
||||
behind a PCI Express bridge.
|
||||
|
||||
For more information on how to identify your adapter, go to the Adapter &
|
||||
Driver ID Guide at:
|
||||
|
||||
http://support.intel.com/support/go/network/adapter/idguide.htm
|
||||
|
||||
For the latest Intel network drivers for Linux, refer to the following
|
||||
website. In the search field, enter your adapter name or type, or use the
|
||||
networking link on the left to search for your adapter:
|
||||
|
||||
http://support.intel.com/support/go/network/adapter/home.htm
|
||||
|
||||
Command Line Parameters
|
||||
=======================
|
||||
|
||||
The default value for each parameter is generally the recommended setting,
|
||||
unless otherwise noted.
|
||||
|
||||
NOTES: For more information about the InterruptThrottleRate,
|
||||
RxIntDelay, TxIntDelay, RxAbsIntDelay, and TxAbsIntDelay
|
||||
parameters, see the application note at:
|
||||
http://www.intel.com/design/network/applnots/ap450.htm
|
||||
|
||||
InterruptThrottleRate
|
||||
---------------------
|
||||
Valid Range: 0,1,3,4,100-100000 (0=off, 1=dynamic, 3=dynamic conservative,
|
||||
4=simplified balancing)
|
||||
Default Value: 3
|
||||
|
||||
The driver can limit the number of interrupts per second that the adapter
will generate for incoming packets. It does this by writing a value to the
adapter that is based on the maximum number of interrupts that the adapter
will generate per second.
|
||||
|
||||
Setting InterruptThrottleRate to a value greater than or equal to 100
|
||||
will program the adapter to send out a maximum of that many interrupts
|
||||
per second, even if more packets have come in. This reduces interrupt
|
||||
load on the system and can lower CPU utilization under heavy load,
|
||||
but will increase latency as packets are not processed as quickly.
|
||||
|
||||
The default behaviour of the driver previously assumed a static
|
||||
InterruptThrottleRate value of 8000, providing a good fallback value for
|
||||
all traffic types, but lacking in small packet performance and latency.
|
||||
The hardware can handle many more small packets per second however, and
|
||||
for this reason an adaptive interrupt moderation algorithm was implemented.
|
||||
|
||||
The driver has two adaptive modes (setting 1 or 3) in which
|
||||
it dynamically adjusts the InterruptThrottleRate value based on the traffic
|
||||
that it receives. After determining the type of incoming traffic in the last
|
||||
timeframe, it will adjust the InterruptThrottleRate to an appropriate value
|
||||
for that traffic.
|
||||
|
||||
The algorithm classifies the incoming traffic every interval into
|
||||
classes. Once the class is determined, the InterruptThrottleRate value is
|
||||
adjusted to suit that traffic type the best. There are three classes defined:
|
||||
"Bulk traffic", for large amounts of packets of normal size; "Low latency",
|
||||
for small amounts of traffic and/or a significant percentage of small
|
||||
packets; and "Lowest latency", for almost completely small packets or
|
||||
minimal traffic.
|
||||
|
||||
In dynamic conservative mode, the InterruptThrottleRate value is set to 4000
|
||||
for traffic that falls in class "Bulk traffic". If traffic falls in the "Low
|
||||
latency" or "Lowest latency" class, the InterruptThrottleRate is increased
|
||||
stepwise to 20000. This default mode is suitable for most applications.
|
||||
|
||||
For situations where low latency is vital such as cluster or
|
||||
grid computing, the algorithm can reduce latency even more when
|
||||
InterruptThrottleRate is set to mode 1. In this mode, which operates
|
||||
the same as mode 3, the InterruptThrottleRate will be increased stepwise to
|
||||
70000 for traffic in class "Lowest latency".
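
The rates quoted for the two adaptive modes can be collected in a small table and turned into the minimum spacing between interrupts. This is only a lookup of the figures given above, not the driver's adaptive algorithm; the names are hypothetical, and the "low_latency" entry for mode 1 assumes that mode 1 matches mode 3 for that class, as the text implies.

```python
# Ceilings quoted in the text; a lookup table for illustration only,
# not the driver's adaptive interrupt moderation algorithm.
ITR_CEILING = {
    3: {"bulk": 4000, "low_latency": 20000, "lowest_latency": 20000},
    1: {"bulk": 4000, "low_latency": 20000, "lowest_latency": 70000},
}

def min_interrupt_interval_us(rate):
    """Shortest spacing, in microseconds, between interrupts at
    `rate` interrupts per second."""
    return 1_000_000 / rate

for mode in (3, 1):
    for traffic_class, rate in ITR_CEILING[mode].items():
        print(mode, traffic_class, rate, min_interrupt_interval_us(rate))
```

At 4000 interrupts per second the adapter waits at least 250 microseconds between interrupts, while at 20000 it waits at least 50 microseconds, which is where the latency difference between the classes comes from.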
|
||||
|
||||
In simplified mode the interrupt rate is based on the ratio of TX and
|
||||
RX traffic. If the bytes per second rate is approximately equal, the
|
||||
interrupt rate will drop as low as 2000 interrupts per second. If the
|
||||
traffic is mostly transmit or mostly receive, the interrupt rate could
|
||||
be as high as 8000.
|
||||
|
||||
Setting InterruptThrottleRate to 0 turns off any interrupt moderation
|
||||
and may improve small packet latency, but is generally not suitable
|
||||
for bulk throughput traffic.
|
||||
|
||||
NOTE: InterruptThrottleRate takes precedence over the TxAbsIntDelay and
|
||||
RxAbsIntDelay parameters. In other words, minimizing the receive
|
||||
and/or transmit absolute delays does not force the controller to
|
||||
generate more interrupts than what the Interrupt Throttle Rate
|
||||
allows.
|
||||
|
||||
NOTE: When e1000e is loaded with default settings and multiple adapters
|
||||
are in use simultaneously, the CPU utilization may increase non-
|
||||
linearly. In order to limit the CPU utilization without impacting
|
||||
the overall throughput, we recommend that you load the driver as
|
||||
follows:
|
||||
|
||||
modprobe e1000e InterruptThrottleRate=3000,3000,3000
|
||||
|
||||
This sets the InterruptThrottleRate to 3000 interrupts/sec for
|
||||
the first, second, and third instances of the driver. The range
|
||||
of 2000 to 3000 interrupts per second works on a majority of
|
||||
systems and is a good starting point, but the optimal value will
|
||||
be platform-specific. If CPU utilization is not a concern, use
|
||||
RX_POLLING (NAPI) and default driver settings.
|
||||
|
||||
RxIntDelay
|
||||
----------
|
||||
Valid Range: 0-65535 (0=off)
|
||||
Default Value: 0
|
||||
|
||||
This value delays the generation of receive interrupts in units of 1.024
|
||||
microseconds. Receive interrupt reduction can improve CPU efficiency if
|
||||
properly tuned for specific network traffic. Increasing this value adds
|
||||
extra latency to frame reception and can end up decreasing the throughput
|
||||
of TCP traffic. If the system is reporting dropped receives, this value
|
||||
may be set too high, causing the driver to run out of available receive
|
||||
descriptors.
|
||||
|
||||
CAUTION: When setting RxIntDelay to a value other than 0, adapters may
hang (stop transmitting) under certain network conditions. If
this occurs, a NETDEV WATCHDOG message is logged in the system
event log. In addition, the controller is automatically reset,
restoring the network connection. To eliminate the potential
for the hang, ensure that RxIntDelay is set to 0.
|
||||
|
||||
RxAbsIntDelay
|
||||
-------------
|
||||
Valid Range: 0-65535 (0=off)
|
||||
Default Value: 8
|
||||
|
||||
This value, in units of 1.024 microseconds, limits the delay in which a
|
||||
receive interrupt is generated. Useful only if RxIntDelay is non-zero,
|
||||
this value ensures that an interrupt is generated after the initial
|
||||
packet is received within the set amount of time. Proper tuning,
|
||||
along with RxIntDelay, may improve traffic throughput in specific network
|
||||
conditions.
|
||||
|
||||
TxIntDelay
|
||||
----------
|
||||
Valid Range: 0-65535 (0=off)
|
||||
Default Value: 8
|
||||
|
||||
This value delays the generation of transmit interrupts in units of
|
||||
1.024 microseconds. Transmit interrupt reduction can improve CPU
|
||||
efficiency if properly tuned for specific network traffic. If the
|
||||
system is reporting dropped transmits, this value may be set too high
|
||||
causing the driver to run out of available transmit descriptors.
|
||||
|
||||
TxAbsIntDelay
|
||||
-------------
|
||||
Valid Range: 0-65535 (0=off)
|
||||
Default Value: 32
|
||||
|
||||
This value, in units of 1.024 microseconds, limits the delay in which a
|
||||
transmit interrupt is generated. Useful only if TxIntDelay is non-zero,
|
||||
this value ensures that an interrupt is generated after the initial
|
||||
packet is sent on the wire within the set amount of time. Proper tuning,
|
||||
along with TxIntDelay, may improve traffic throughput in specific
|
||||
network conditions.
|
||||
|
||||
Copybreak
|
||||
---------
|
||||
Valid Range: 0-xxxxxxx (0=off)
|
||||
Default Value: 256
|
||||
|
||||
Driver copies all packets below or equaling this size to a fresh RX
|
||||
buffer before handing it up the stack.
|
||||
|
||||
This parameter is different than other parameters, in that it is a
|
||||
single (not 1,1,1 etc.) parameter applied to all driver instances and
|
||||
it is also available during runtime at
|
||||
/sys/module/e1000e/parameters/copybreak
|
||||
|
||||
SmartPowerDownEnable
|
||||
--------------------
|
||||
Valid Range: 0-1
|
||||
Default Value: 0 (disabled)
|
||||
|
||||
Allows PHY to turn off in lower power states. The user can set this parameter
|
||||
in supported chipsets.
|
||||
|
||||
KumeranLockLoss
|
||||
---------------
|
||||
Valid Range: 0-1
|
||||
Default Value: 1 (enabled)
|
||||
|
||||
This workaround skips resetting the PHY at shutdown for the initial
|
||||
silicon releases of ICH8 systems.
|
||||
|
||||
IntMode
|
||||
-------
|
||||
Valid Range: 0-2 (0=legacy, 1=MSI, 2=MSI-X)
|
||||
Default Value: 2
|
||||
|
||||
Allows changing the interrupt mode at module load time, without requiring a
|
||||
recompile. If the driver load fails to enable a specific interrupt mode, the
|
||||
driver will try other interrupt modes, from least to most compatible. The
|
||||
interrupt order is MSI-X, MSI, Legacy. If specifying MSI (IntMode=1)
|
||||
interrupts, only MSI and Legacy will be attempted.
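
The fallback order described above can be written down as a short list operation. This is a sketch of the documented behaviour, not driver code: the names are hypothetical, and the result for IntMode=0 (legacy only) is an assumption that extends the stated pattern.

```python
# Order in which the text says the modes are tried.
ORDER = ["MSI-X", "MSI", "Legacy"]
INTMODE = {0: "Legacy", 1: "MSI", 2: "MSI-X"}

def modes_attempted(int_mode):
    """Interrupt modes tried, in order, for a given IntMode value."""
    start = ORDER.index(INTMODE[int_mode])
    return ORDER[start:]

print(modes_attempted(2))   # ['MSI-X', 'MSI', 'Legacy']
print(modes_attempted(1))   # ['MSI', 'Legacy']
```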
|
||||
|
||||
CrcStripping
|
||||
------------
|
||||
Valid Range: 0-1
|
||||
Default Value: 1 (enabled)
|
||||
|
||||
Strip the CRC from received packets before sending up the network stack. If
|
||||
you have a machine with a BMC enabled but cannot receive IPMI traffic after
|
||||
loading or enabling the driver, try disabling this feature.
|
||||
|
||||
WriteProtectNVM
|
||||
---------------
|
||||
Valid Range: 0,1
|
||||
Default Value: 1
|
||||
|
||||
If set to 1, configure the hardware to ignore all write/erase cycles to the
|
||||
GbE region in the ICHx NVM (in order to prevent accidental corruption of the
|
||||
NVM). This feature can be disabled by setting the parameter to 0 during initial
|
||||
driver load.
|
||||
NOTE: The machine must be power cycled (full off/on) when enabling NVM writes
|
||||
via setting the parameter to zero. Once the NVM has been locked (via the
|
||||
parameter at 1 when the driver loads) it cannot be unlocked except via power
|
||||
cycle.
|
||||
|
||||
Additional Configurations
|
||||
=========================
|
||||
|
||||
Jumbo Frames
|
||||
------------
|
||||
Jumbo Frames support is enabled by changing the MTU to a value larger than
|
||||
the default of 1500. Use the ifconfig command to increase the MTU size.
|
||||
For example:
|
||||
|
||||
ifconfig eth<x> mtu 9000 up
|
||||
|
||||
This setting is not saved across reboots.
|
||||
|
||||
Notes:
|
||||
|
||||
- The maximum MTU setting for Jumbo Frames is 9216. This value coincides
|
||||
with the maximum Jumbo Frames size of 9234 bytes.
|
||||
|
||||
- Using Jumbo frames at 10 or 100 Mbps is not supported and may result in
|
||||
poor performance or loss of link.
|
||||
|
||||
- Some adapters limit Jumbo Frames sized packets to a maximum of
|
||||
4096 bytes and some adapters do not support Jumbo Frames.
|
||||
|
||||
- Jumbo Frames cannot be configured on an 82579-based Network device, if
|
||||
MACSec is enabled on the system.
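
The 18-byte gap between the maximum MTU and the maximum frame size in the first note is consistent with a standard Ethernet header (14 bytes) plus frame check sequence (4 bytes). The sketch below shows the arithmetic; the function name is hypothetical, and VLAN tags are deliberately left out.

```python
ETH_HEADER = 14   # destination MAC + source MAC + EtherType
ETH_FCS = 4       # frame check sequence (CRC)

def frame_size(mtu):
    """On-wire frame size for a given MTU, without a VLAN tag."""
    return mtu + ETH_HEADER + ETH_FCS

print(frame_size(9216))   # 9234
```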
|
||||
|
||||
ethtool
|
||||
-------
|
||||
The driver utilizes the ethtool interface for driver configuration and
|
||||
diagnostics, as well as displaying statistical information. We
|
||||
strongly recommend downloading the latest version of ethtool at:
|
||||
|
||||
http://ftp.kernel.org/pub/software/network/ethtool/
|
||||
|
||||
NOTE: When validating enable/disable tests on some parts (82578, for example)
|
||||
you need to add a few seconds between tests when working with ethtool.
|
||||
|
||||
Speed and Duplex
|
||||
----------------
|
||||
Speed and Duplex are configured through the ethtool* utility. For
|
||||
instructions, refer to the ethtool man page.
|
||||
|
||||
Enabling Wake on LAN* (WoL)
|
||||
---------------------------
|
||||
WoL is configured through the ethtool* utility. For instructions on
|
||||
enabling WoL with ethtool, refer to the ethtool man page.
|
||||
|
||||
WoL will be enabled on the system during the next shut down or reboot.
|
||||
For this driver version, in order to enable WoL, the e1000e driver must be
|
||||
loaded when shutting down or rebooting the system.
|
||||
|
||||
In most cases Wake on LAN is only supported on port A for multiple port
adapters. To verify whether a port supports Wake on LAN, run ethtool eth<X>.
|
||||
|
||||
Support
|
||||
=======
|
||||
|
||||
For general information, go to the Intel support website at:
|
||||
|
||||
www.intel.com/support/
|
||||
|
||||
or the Intel Wired Networking project hosted by Sourceforge at:
|
||||
|
||||
http://sourceforge.net/projects/e1000
|
||||
|
||||
If an issue is identified with the released source code on the supported
|
||||
kernel with a supported adapter, email the specific information related
|
||||
to the issue to e1000-devel@lists.sf.net
|
||||
528
Documentation/networking/eql.txt
Normal file
|
|
@ -0,0 +1,528 @@
|
|||
EQL Driver: Serial IP Load Balancing HOWTO
|
||||
Simon "Guru Aleph-Null" Janes, simon@ncm.com
|
||||
v1.1, February 27, 1995
|
||||
|
||||
This is the manual for the EQL device driver. EQL is a software device
|
||||
that lets you load-balance IP serial links (SLIP or uncompressed PPP)
|
||||
to increase your bandwidth. It will not reduce your latency (i.e. ping
times) except when you already have lots of traffic on your link, in which
case it will help. This driver has been tested
|
||||
with the 1.1.75 kernel, and is known to have patched cleanly with
|
||||
1.1.86. Some testing with 1.1.92 has been done with the v1.1 patch
|
||||
which was only created to patch cleanly in the very latest kernel
|
||||
source trees. (Yes, it worked fine.)
|
||||
|
||||
1. Introduction
|
||||
|
||||
Which is worse? A huge fee for a 56K leased line or two phone lines?
|
||||
It's probably the former. If you find yourself craving more bandwidth,
|
||||
and have an ISP that is flexible, it is now possible to bind modems
|
||||
together to work as one point-to-point link to increase your
|
||||
bandwidth. All without having to have a special black box on either
|
||||
side.
|
||||
|
||||
|
||||
The eql driver has only been tested with the Livingston PortMaster-2e
|
||||
terminal server. I do not know if other terminal servers support load-
|
||||
balancing, but I do know that the PortMaster does it, and does it
|
||||
almost as well as the eql driver seems to do it (-- Unfortunately, in
|
||||
my testing so far, the Livingston PortMaster 2e's load-balancing is a
|
||||
good 1 to 2 KB/s slower than the test machine working with a 28.8 Kbps
|
||||
and 14.4 Kbps connection. However, I am not sure that it really is
|
||||
the PortMaster, or if it's Linux's TCP drivers. I'm told that Linux's
|
||||
TCP implementation is pretty fast though.--)
|
||||
|
||||
|
||||
I suggest to ISPs out there that it would probably be fair to charge
|
||||
a load-balancing client 75% of the cost of the second line and 50% of
|
||||
the cost of the third line etc...
|
||||
|
||||
|
||||
Hey, we can all dream you know...
|
||||
|
||||
|
||||
2. Kernel Configuration
|
||||
|
||||
Here I describe the general steps of getting a kernel up and working
with the eql driver, from patching and building to installing.
|
||||
|
||||
|
||||
2.1. Patching The Kernel
|
||||
|
||||
If you do not have or cannot get a copy of the kernel with the eql
|
||||
driver folded into it, get your copy of the driver from
|
||||
ftp://slaughter.ncm.com/pub/Linux/LOAD_BALANCING/eql-1.1.tar.gz.
|
||||
Unpack this archive someplace obvious like /usr/local/src/. It will
|
||||
create the following files:
|
||||
|
||||
|
||||
|
||||
______________________________________________________________________
|
||||
-rw-r--r-- guru/ncm 198 Jan 19 18:53 1995 eql-1.1/NO-WARRANTY
|
||||
-rw-r--r-- guru/ncm 30620 Feb 27 21:40 1995 eql-1.1/eql-1.1.patch
|
||||
-rwxr-xr-x guru/ncm 16111 Jan 12 22:29 1995 eql-1.1/eql_enslave
|
||||
-rw-r--r-- guru/ncm 2195 Jan 10 21:48 1995 eql-1.1/eql_enslave.c
|
||||
______________________________________________________________________
|
||||
|
||||
Unpack a recent kernel (something after 1.1.92) someplace convenient
|
||||
like say /usr/src/linux-1.1.92.eql. Use symbolic links to point
|
||||
/usr/src/linux to this development directory.
|
||||
|
||||
|
||||
Apply the patch by running the commands:
|
||||
|
||||
|
||||
______________________________________________________________________
|
||||
cd /usr/src
|
||||
patch </usr/local/src/eql-1.1/eql-1.1.patch
|
||||
______________________________________________________________________
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
2.2. Building The Kernel
|
||||
|
||||
After patching the kernel, run make config and configure the kernel
|
||||
for your hardware.
|
||||
|
||||
|
||||
After configuration, make and install according to your habit.
|
||||
|
||||
|
||||
3. Network Configuration
|
||||
|
||||
So far, I have only used the eql device with the DSLIP SLIP connection
|
||||
manager by Matt Dillon (-- "The man who sold his soul to code so much
|
||||
so quickly."--) . How you configure it for other "connection"
|
||||
managers is up to you. Most other connection managers that I've seen
|
||||
don't do a very good job when it comes to handling more than one
|
||||
connection.
|
||||
|
||||
|
||||
3.1. /etc/rc.d/rc.inet1
|
||||
|
||||
In rc.inet1, ifconfig the eql device to the IP address you usually use
|
||||
for your machine, and the MTU you prefer for your SLIP lines. One
|
||||
could argue that MTU should be roughly half the usual size for two
|
||||
modems, one-third for three, one-fourth for four, etc... But going
|
||||
too far below 296 is probably overkill. Here is an example ifconfig
|
||||
command that sets up the eql device:
|
||||
|
||||
|
||||
|
||||
______________________________________________________________________
|
||||
ifconfig eql 198.67.33.239 mtu 1006
|
||||
______________________________________________________________________
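
The MTU rule of thumb above (divide the usual MTU by the number of modems, but do not go far below 296) can be sketched as follows. The function name is hypothetical, and treating 296 as a hard floor is an assumption made for illustration; the text only calls smaller values "probably overkill".

```python
def suggested_mtu(usual_mtu, links, floor=296):
    """Usual MTU divided by the number of links, never below `floor`."""
    if links < 1:
        raise ValueError("need at least one link")
    return max(usual_mtu // links, floor)

print(suggested_mtu(1006, 2))   # 503
print(suggested_mtu(1006, 4))   # 296
```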
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Once the eql device is up and running, add a static default route to
|
||||
it in the routing table using the cool new route syntax that makes
|
||||
life so much easier:
|
||||
|
||||
|
||||
|
||||
______________________________________________________________________
|
||||
route add default eql
|
||||
______________________________________________________________________
|
||||
|
||||
|
||||
3.2. Enslaving Devices By Hand
|
||||
|
||||
Enslaving devices by hand requires two utility programs: eql_enslave
|
||||
and eql_emancipate (-- eql_emancipate hasn't been written because when
|
||||
an enslaved device "dies", it is automatically taken out of the queue.
|
||||
I haven't found a good reason to write it yet... other than for
|
||||
completeness, but that isn't a good motivator is it?--)
|
||||
|
||||
|
||||
The syntax for enslaving a device is "eql_enslave <master-name>
|
||||
<slave-name> <estimated-bps>". Here are some example enslavings:
|
||||
|
||||
|
||||
|
||||
______________________________________________________________________
|
||||
eql_enslave eql sl0 28800
|
||||
eql_enslave eql ppp0 14400
|
||||
eql_enslave eql sl1 57600
|
||||
______________________________________________________________________
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
When you want to free a device from its life of slavery, you can
|
||||
either down the device with ifconfig (eql will automatically bury the
|
||||
dead slave and remove it from its queue) or use eql_emancipate to free
|
||||
it. (-- Or just ifconfig it down, and the eql driver will take it out
|
||||
for you.--)
|
||||
|
||||
|
||||
|
||||
______________________________________________________________________
|
||||
eql_emancipate eql sl0
|
||||
eql_emancipate eql ppp0
|
||||
eql_emancipate eql sl1
|
||||
______________________________________________________________________
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
3.3. DSLIP Configuration for the eql Device
|
||||
|
||||
The general idea is to bring up and keep up as many SLIP connections
|
||||
as you need, automatically.
|
||||
|
||||
|
||||
3.3.1. /etc/slip/runslip.conf
|
||||
|
||||
Here is an example runslip.conf:
|
||||
______________________________________________________________________
|
||||
name sl-line-1
|
||||
enabled
|
||||
baud 38400
|
||||
mtu 576
|
||||
ducmd -e /etc/slip/dialout/cua2-288.xp -t 9
|
||||
command eql_enslave eql $interface 28800
|
||||
address 198.67.33.239
|
||||
line /dev/cua2
|
||||
|
||||
name sl-line-2
|
||||
enabled
|
||||
baud 38400
|
||||
mtu 576
|
||||
ducmd -e /etc/slip/dialout/cua3-288.xp -t 9
|
||||
command eql_enslave eql $interface 28800
|
||||
address 198.67.33.239
|
||||
line /dev/cua3
|
||||
______________________________________________________________________
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
3.4. Using PPP and the eql Device
|
||||
|
||||
I have not yet done any load-balancing testing for PPP devices, mainly
|
||||
because I don't have a PPP-connection manager like SLIP has with
|
||||
DSLIP. I did find a good tip from LinuxNET:Billy for PPP performance:
|
||||
make sure you have asyncmap set to something so that control
|
||||
characters are not escaped.
|
||||
|
||||
|
||||
I tried to fix up a PPP script/system for redialing lost PPP
|
||||
connections for use with the eql driver the weekend of Feb 25-26 '95
|
||||
(Hereafter known as the 8-hour PPP Hate Festival). Perhaps later this
|
||||
year.
|
||||
|
||||
|
||||
4. About the Slave Scheduler Algorithm
|
||||
|
||||
The slave scheduler probably could be replaced with a dozen other
|
||||
things and push traffic much faster. The formula in the current set
|
||||
up of the driver was tuned to handle slaves with wildly different
|
||||
bits-per-second "priorities".
|
||||
|
||||
|
||||
All testing I have done was with two 28.8 V.FC modems, one connecting
|
||||
at 28800 bps or slower, and the other connecting at 14400 bps all the
|
||||
time.
|
||||
|
||||
|
||||
One version of the scheduler was able to push 5.3 K/s through the
|
||||
28800 and 14400 connections, but when the priorities on the links were
|
||||
very wide apart (57600 vs. 14400) the "faster" modem received all
|
||||
traffic and the "slower" modem starved.
|
||||
|
||||
|
||||
5. Testers' Reports
|
||||
|
||||
Some people have experimented with the eql device with newer
|
||||
kernels (than 1.1.75). I have since updated the driver to patch
|
||||
cleanly in newer kernels because of the removal of the old "slave-
|
||||
balancing" driver config option.
|
||||
|
||||
|
||||
o icee from LinuxNET patched 1.1.86 without any rejects and was able
|
||||
to boot the kernel and enslave a couple of ISDN PPP links.
|
||||
|
||||
5.1. Randolph Bentson's Test Report
|
||||
|
||||
From bentson@grieg.seaslug.org Wed Feb 8 19:08:09 1995
|
||||
Date: Tue, 7 Feb 95 22:57 PST
|
||||
From: Randolph Bentson <bentson@grieg.seaslug.org>
|
||||
To: guru@ncm.com
|
||||
Subject: EQL driver tests
|
||||
|
||||
|
||||
I have been checking out your eql driver. (Nice work, that!)
|
||||
Although you may already have done this performance testing, here
|
||||
are some data I've discovered.
|
||||
|
||||
Randolph Bentson
|
||||
bentson@grieg.seaslug.org
|
||||
|
||||
---------------------------------------------------------
|
||||
|
||||
|
||||
A pseudo-device driver, EQL, written by Simon Janes, can be used
|
||||
to bundle multiple SLIP connections into what appears to be a
|
||||
single connection. This allows one to improve dial-up network
|
||||
connectivity gradually, without having to buy expensive DSU/CSU
|
||||
hardware and services.
|
||||
|
||||
I have done some testing of this software, with two goals in
|
||||
mind: first, to ensure it actually works as described and
|
||||
second, as a method of exercising my device driver.
|
||||
|
||||
The following performance measurements were derived from a set
|
||||
of SLIP connections run between two Linux systems (1.1.84) using
|
||||
a 486DX2/66 with a Cyclom-8Ys and a 486SLC/40 with a Cyclom-16Y.
|
||||
(Ports 0,1,2,3 were used. A later configuration will distribute
|
||||
port selection across the different Cirrus chips on the boards.)
|
||||
Once a link was established, I timed a binary ftp transfer of
|
||||
289284 bytes of data. If there were no overhead (packet headers,
|
||||
inter-character and inter-packet delays, etc.) the transfers
|
||||
would take the following times:
|
||||
|
||||
bits/sec  seconds
345600    8.3
234600    12.3
172800    16.7
153600    18.8
76800     37.6
57600     50.2
38400     75.3
28800     100.4
19200     150.6
9600      301.3
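
The overhead-free times in the table above can be reproduced with a short calculation. The sketch below assumes 10 bits on the wire per byte (one start bit, eight data bits, one stop bit); the report does not state this, but the listed figures are consistent with it. The function name is hypothetical.

```c
#include <assert.h>

/* Assumption: an asynchronous serial line spends 10 bits per byte
 * (1 start + 8 data + 1 stop). Inferred from the table, not stated. */
#define BITS_PER_BYTE_ON_WIRE 10.0

/* Overhead-free transfer time, in seconds, for `bytes` at `bits_per_sec`. */
double ideal_transfer_seconds(long bytes, long bits_per_sec)
{
    assert(bits_per_sec > 0);
    return (double)bytes * BITS_PER_BYTE_ON_WIRE / (double)bits_per_sec;
}
```

For the 289284-byte file at 9600 bits/sec this gives about 301.3 seconds. The table values appear to be truncated to one decimal place rather than rounded.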
|
||||
|
||||
A single line running at the lower speeds and with large packets
|
||||
comes to within 2% of this. Performance is limited for the higher
|
||||
speeds (as predicted by the Cirrus databook) to an aggregate of
|
||||
about 160 kbits/sec. The next round of testing will distribute
|
||||
the load across two or more Cirrus chips.
|
||||
|
||||
The good news is that one gets nearly the full advantage of the
|
||||
second, third, and fourth line's bandwidth. (The bad news is
|
||||
that the connection establishment seemed fragile for the higher
|
||||
speeds. Once established, the connection seemed robust enough.)
|
||||
|
||||
#lines  speed     mtu  seconds   theory  actual   %of
        bits/sec       duration  speed   speed    max
3       115200    900  _         345600
3       115200    400  18.1      345600  159825   46
2       115200    900  _         230400
2       115200    600  18.1      230400  159825   69
2       115200    400  19.3      230400  149888   65
4       57600     900  _         234600
4       57600     600  _         234600
4       57600     400  _         234600
3       57600     600  20.9      172800  138413   80
3       57600     900  21.2      172800  136455   78
3       115200    600  21.7      345600  133311   38
3       57600     400  22.5      172800  128571   74
4       38400     900  25.2      153600  114795   74
4       38400     600  26.4      153600  109577   71
4       38400     400  27.3      153600  105965   68
2       57600     900  29.1      115200  99410.3  86
1       115200    900  30.7      115200  94229.3  81
2       57600     600  30.2      115200  95789.4  83
3       38400     900  30.3      115200  95473.3  82
3       38400     600  31.2      115200  92719.2  80
1       115200    600  31.3      115200  92423    80
2       57600     400  32.3      115200  89561.6  77
1       115200    400  32.8      115200  88196.3  76
3       38400     400  33.5      115200  86353.4  74
2       38400     900  43.7      76800   66197.7  86
2       38400     600  44        76800   65746.4  85
2       38400     400  47.2      76800   61289    79
4       19200     900  50.8      76800   56945.7  74
4       19200     400  53.2      76800   54376.7  70
4       19200     600  53.7      76800   53870.4  70
1       57600     900  54.6      57600   52982.4  91
1       57600     600  56.2      57600   51474    89
3       19200     900  60.5      57600   47815.5  83
1       57600     400  60.2      57600   48053.8  83
3       19200     600  62        57600   46658.7  81
3       19200     400  64.7      57600   44711.6  77
1       38400     900  79.4      38400   36433.8  94
1       38400     600  82.4      38400   35107.3  91
2       19200     900  84.4      38400   34275.4  89
1       38400     400  86.8      38400   33327.6  86
2       19200     600  87.6      38400   33023.3  85
2       19200     400  91.2      38400   31719.7  82
4       9600      900  94.7      38400   30547.4  79
4       9600      400  106       38400   27290.9  71
4       9600      600  110       38400   26298.5  68
3       9600      900  118       28800   24515.6  85
3       9600      600  120       28800   24107    83
3       9600      400  131       28800   22082.7  76
1       19200     900  155       19200   18663.5  97
1       19200     600  161       19200   17968    93
1       19200     400  170       19200   17016.7  88
2       9600      600  176       19200   16436.6  85
2       9600      900  180       19200   16071.3  83
2       9600      400  181       19200   15982.5  83
1       9600      900  305       9600    9484.72  98
1       9600      600  314       9600    9212.87  95
1       9600      400  332       9600    8713.37  90
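
The "actual speed" and "%of max" columns above follow from the measured duration. The sketch below is a reconstruction, not the reporter's own method: it assumes the same 289284-byte transfer, 10 bits on the wire per byte, and a percentage that is truncated rather than rounded (several rows, such as 79.4 seconds at 38400 bits/sec giving 94, only match with truncation). The function names are hypothetical.

```c
#include <assert.h>

/* Effective line rate in bits/sec for `bytes` moved in `seconds`,
 * assuming 10 wire bits per byte (start + 8 data + stop). */
double actual_bits_per_sec(long bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)bytes * 10.0 / seconds;
}

/* Share of the theoretical rate that was achieved, truncated to a
 * whole percent as the table appears to do. */
int percent_of_max(double actual, double theory)
{
    assert(theory > 0.0);
    return (int)(actual / theory * 100.0);
}
```

For the single 9600 bits/sec line with MTU 900 (305 seconds), this yields roughly 9484.7 bits/sec and 98 percent.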
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
5.2. Antony Healey's Report
|
||||
|
||||
Date: Mon, 13 Feb 1995 16:17:29 +1100 (EST)
|
||||
From: Antony Healey <ahealey@st.nepean.uws.edu.au>
|
||||
To: Simon Janes <guru@ncm.com>
|
||||
Subject: Re: Load Balancing
|
||||
|
||||
Hi Simon,
|
||||
I've installed your patch and it works great. I have trialed
|
||||
it over twin SL/IP lines, just over null modems, but I was
|
||||
able to transfer data at over 48Kb/s [ISDN link -Simon]. I managed a
|
||||
transfer of up to 7.5 Kbyte/s on one go, but averaged around
|
||||
6.4 Kbyte/s, which I think is pretty cool. :)
|
||||
145
Documentation/networking/fib_trie.txt
Normal file
|
|
@ -0,0 +1,145 @@
|
|||
LC-trie implementation notes.
|
||||
|
||||
Node types
|
||||
----------
|
||||
leaf
|
||||
An end node with data. This has a copy of the relevant key, along
|
||||
with 'hlist' with routing table entries sorted by prefix length.
|
||||
See struct leaf and struct leaf_info.
|
||||
|
||||
trie node or tnode
|
||||
An internal node, holding an array of child (leaf or tnode) pointers,
|
||||
indexed through a subset of the key. See Level Compression.
|
||||
|
||||
A few concepts explained
|
||||
------------------------
|
||||
Bits (tnode)
|
||||
The number of bits in the key segment used for indexing into the
|
||||
child array - the "child index". See Level Compression.
|
||||
|
||||
Pos (tnode)
|
||||
The position (in the key) of the key segment used for indexing into
|
||||
the child array. See Path Compression.
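
As an illustration of how "pos" and "bits" select a child, here is a minimal sketch of extracting the key segment. It assumes 32-bit (IPv4) keys and that "pos" is counted from the most significant bit; both are assumptions for this example, and the function name is hypothetical rather than taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

#define KEYLENGTH 32u   /* assumption: 32-bit IPv4 keys */

/* Return the `bits`-wide key segment that starts `pos` bits from the
 * most significant end of `key`. The result indexes the child array. */
uint32_t child_index(uint32_t key, unsigned pos, unsigned bits)
{
    assert(pos + bits <= KEYLENGTH);
    if (bits == 0)
        return 0;   /* avoid an undefined full-width shift */
    return (uint32_t)(key << pos) >> (KEYLENGTH - bits);
}
```

With the key 0xC0A80100 (192.168.1.0), pos 0 and bits 8 select index 0xC0, and pos 8 and bits 8 select index 0xA8.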
|
||||
|
||||
Path Compression / skipped bits
|
||||
Any given tnode is linked to from the child array of its parent, using
|
||||
a segment of the key specified by the parent's "pos" and "bits".
|
||||
In certain cases, this tnode's own "pos" will not be immediately
|
||||
adjacent to the parent (pos+bits), but there will be some bits
|
||||
in the key skipped over because they represent a single path with no
|
||||
deviations. These "skipped bits" constitute Path Compression.
|
||||
Note that the search algorithm will simply skip over these bits when
|
||||
searching, making it necessary to save the keys in the leaves to
|
||||
verify that they actually do match the key we are searching for.
|
||||
|
||||
Level Compression / child arrays
|
||||
The trie is kept level balanced by moving, under certain conditions, the
|
||||
children of a full child (see "full_children") up one level, so that
|
||||
instead of a pure binary tree, each internal node ("tnode") may
|
||||
contain an arbitrarily large array of links to several children.
|
||||
Conversely, a tnode with a mostly empty child array (see empty_children)
|
||||
may be "halved", having some of its children moved downwards one level,
|
||||
in order to avoid ever-increasing child arrays.
|
||||
|
||||
empty_children
|
||||
The number of positions in the child array of a given tnode that are
|
||||
NULL.
|
||||
|
||||
full_children
|
||||
The number of children of a given tnode that aren't path compressed.
|
||||
(In other words, they aren't NULL or leaves and their "pos" is equal
|
||||
to this tnode's "pos"+"bits").
|
||||
|
||||
(The word "full" here is used more in the sense of "complete" than
|
||||
as the opposite of "empty", which might be a tad confusing.)
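
The two counters defined above can be sketched as follows. The node layout here is a simplified, hypothetical stand-in for the kernel's structures, kept only detailed enough to show which children count as "empty" and which as "full".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node; the kernel's real structs differ. */
struct node {
    int is_leaf;            /* non-zero for a leaf */
    unsigned pos;           /* tnode only */
    unsigned bits;          /* tnode only: child array has 1 << bits slots */
    struct node **child;    /* tnode only */
};

/* Number of NULL slots in the child array of tnode `tn`. */
size_t count_empty_children(const struct node *tn)
{
    size_t i, n = 0;

    assert(tn && !tn->is_leaf);
    for (i = 0; i < ((size_t)1 << tn->bits); i++)
        if (tn->child[i] == NULL)
            n++;
    return n;
}

/* Number of children that are tnodes starting exactly at pos + bits,
 * i.e. children reached without any skipped (path compressed) bits. */
size_t count_full_children(const struct node *tn)
{
    size_t i, n = 0;

    assert(tn && !tn->is_leaf);
    for (i = 0; i < ((size_t)1 << tn->bits); i++) {
        const struct node *c = tn->child[i];

        if (c && !c->is_leaf && c->pos == tn->pos + tn->bits)
            n++;
    }
    return n;
}
```

A parent with pos 0 and bits 2 holding one leaf, one NULL slot, one tnode at pos 2 and one tnode at pos 5 has one empty child and one full child.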
|
||||
|
||||
Comments
|
||||
---------
|
||||
|
||||
We have tried to keep the structure of the code as close to fib_hash as
|
||||
possible to allow verification and help with reviewing.
|
||||
|
||||
fib_find_node()
|
||||
A good start for understanding this code. This function implements a
|
||||
straightforward trie lookup.
|
||||
|
||||
fib_insert_node()
|
||||
Inserts a new leaf node in the trie. This is a bit more complicated than
|
||||
fib_find_node(). Inserting a new node means we might have to run the
|
||||
level compression algorithm on part of the trie.
|
||||
|
||||
trie_leaf_remove()
|
||||
Looks up a key, deletes it and runs the level compression algorithm.
|
||||
|
||||
trie_rebalance()
|
||||
The key function for the dynamic trie. After any change in the trie,
|
||||
it is run to optimize and reorganize. It will walk the trie upwards
|
||||
towards the root from a given tnode, doing a resize() at each step
|
||||
to implement level compression.
|
||||
|
||||
resize()
|
||||
Analyzes a tnode and optimizes the child array size by either inflating
|
||||
or shrinking it repeatedly until it fulfills the criteria for optimal
|
||||
level compression. This part follows the original paper pretty closely
|
||||
and there may be some room for experimentation here.
|
||||
|
||||
inflate()
|
||||
Doubles the size of the child array within a tnode. Used by resize().
|
||||
|
||||
halve()
|
||||
Halves the size of the child array within a tnode - the inverse of
|
||||
inflate(). Used by resize().
|
||||
|
||||
fn_trie_insert(), fn_trie_delete(), fn_trie_select_default()
|
||||
The route manipulation functions. Should conform pretty closely to the
|
||||
corresponding functions in fib_hash.
|
||||
|
||||
fn_trie_flush()
|
||||
This walks the full trie (using nextleaf()) and searches for empty
|
||||
leaves which have to be removed.
|
||||
|
||||
fn_trie_dump()
|
||||
Dumps the routing table ordered by prefix length. This is somewhat
|
||||
slower than the corresponding fib_hash function, as we have to walk the
|
||||
entire trie for each prefix length. In comparison, fib_hash is organized
|
||||
as one "zone"/hash per prefix length.
|
||||
|
||||
Locking
|
||||
-------
|
||||
|
||||
fib_lock is used as an RW-lock in the same way that this is done in fib_hash.
|
||||
However, the functions are somewhat separated for other possible locking
|
||||
scenarios. It might conceivably be possible to run trie_rebalance via RCU
|
||||
to avoid read_lock in the fn_trie_lookup() function.
|
||||
|
||||
Main lookup mechanism
|
||||
---------------------
|
||||
fn_trie_lookup() is the main lookup function.
|
||||
|
||||
The lookup is in its simplest form just like fib_find_node(). We descend the
|
||||
trie, key segment by key segment, until we find a leaf. check_leaf() does
|
||||
the fib_semantic_match in the leaf's sorted prefix hlist.
|
||||
|
||||
If we find a match, we are done.
|
||||
|
||||
If we don't find a match, we enter prefix matching mode. The prefix length,
|
||||
starting out at the same as the key length, is reduced one step at a time,
|
||||
and we backtrack upwards through the trie trying to find a longest matching
|
||||
prefix. The goal is always to reach a leaf and get a positive result from the
|
||||
fib_semantic_match mechanism.
|
||||
|
||||
Inside each tnode, the search for longest matching prefix consists of searching
|
||||
through the child array, chopping off (zeroing) the least significant "1" of
|
||||
the child index until we find a match or the child index consists of nothing but
|
||||
zeros.
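
The "chop off the least significant 1" step described above is a standard bit trick. The sketch below lists the child indices that would be tried, in order, for a given starting index; it illustrates the idea only and is not the kernel's lookup loop. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Zero the least significant set bit of `cindex`. */
uint32_t chop_lowest_one(uint32_t cindex)
{
    return cindex & (cindex - 1);
}

/* Write the sequence of candidate child indices, starting with `cindex`
 * and ending with 0, into `out` (at most `cap` entries).
 * Returns the number of entries written. */
size_t candidate_indices(uint32_t cindex, uint32_t *out, size_t cap)
{
    size_t n = 0;

    assert(out != NULL);
    while (n < cap) {
        out[n++] = cindex;
        if (cindex == 0)
            break;
        cindex = chop_lowest_one(cindex);
    }
    return n;
}
```

Starting from binary 1011, the candidates are 1011, 1010, 1000 and 0000. Each step shortens the prefix being matched inside the tnode.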
|
||||
|
||||
At this point we backtrack (t->stats.backtrack++) up the trie, continuing to
|
||||
chop off part of the key in order to find the longest matching prefix.
|
||||
|
||||
At this point we will repeatedly descend subtries to look for a match, and there
|
||||
are some optimizations available that can provide us with "shortcuts" to avoid
|
||||
descending into dead ends. Look for "HL_OPTIMIZE" sections in the code.
|
||||
|
||||
To alleviate any doubts about the correctness of the route selection process,
|
||||
a new netlink operation has been added. Look for NETLINK_FIB_LOOKUP, which
|
||||
gives userland access to fib_lookup().
|
||||
1296
Documentation/networking/filter.txt
Normal file
File diff suppressed because it is too large
64
Documentation/networking/fore200e.txt
Normal file
|
|
@ -0,0 +1,64 @@
|
|||
|
||||
FORE Systems PCA-200E/SBA-200E ATM NIC driver
|
||||
---------------------------------------------
|
||||
|
||||
This driver adds support for the FORE Systems 200E-series ATM adapters
|
||||
to the Linux operating system. It is based on the earlier PCA-200E driver
|
||||
written by Uwe Dannowski.
|
||||
|
||||
The driver simultaneously supports PCA-200E and SBA-200E adapters on
|
||||
i386, alpha (untested), powerpc, sparc and sparc64 archs.
|
||||
|
||||
The intent is to enable the use of different models of FORE adapters at the
|
||||
same time, by hosts that have several bus interfaces (such as PCI+SBUS,
|
||||
or PCI+EISA).
|
||||
|
||||
Only PCI and SBUS devices are currently supported by the driver, but support
|
||||
for other bus interfaces such as EISA should not be too hard to add.
|
||||
|
||||
|
||||
Firmware Copyright Notice
|
||||
-------------------------
|
||||
|
||||
Please read the fore200e_firmware_copyright file present
|
||||
in the linux/drivers/atm directory for details and restrictions.
|
||||
|
||||
|
||||
Firmware Updates
|
||||
----------------
|
||||
|
||||
The FORE Systems 200E-series driver is shipped with firmware data being
|
||||
uploaded to the ATM adapters at system boot time or at module loading time.
|
||||
The supplied firmware images should work with all adapters.
|
||||
|
||||
However, if you encounter problems (the firmware doesn't start or the driver
|
||||
is unable to read the PROM data), you may consider trying another firmware
|
||||
version. Alternative binary firmware images can be found somewhere on the
|
||||
ForeThought CD-ROM supplied with your adapter by FORE Systems.
|
||||
|
||||
You can also get the latest firmware images from FORE Systems at
|
||||
http://en.wikipedia.org/wiki/FORE_Systems. Register with TACTics Online and go to
|
||||
the 'software updates' pages. The firmware binaries are part of
|
||||
the various ForeThought software distributions.
|
||||
|
||||
Notice that different versions of the PCA-200E firmware exist, depending
|
||||
on the endianness of the host architecture. The driver is shipped with
|
||||
both little and big endian PCA firmware images.
|
||||
|
||||
Name and location of the new firmware images can be set at kernel
|
||||
configuration time:
|
||||
|
||||
1. Copy the new firmware binary files (with .bin, .bin1 or .bin2 suffix)
|
||||
to some directory, such as linux/drivers/atm.
|
||||
|
||||
2. Reconfigure your kernel to set the new firmware name and location.
|
||||
Expected pathnames are absolute or relative to the drivers/atm directory.
|
||||
|
||||
3. Rebuild and re-install your kernel or your module.
|
||||
|
||||
|
||||
Feedback
|
||||
--------
|
||||
|
||||
Feedback is welcome. Please send success stories/bug reports/
|
||||
patches/improvements/comments/flames to <lizzi@cnam.fr>.
|
||||
39
Documentation/networking/framerelay.txt
Normal file
|
|
@ -0,0 +1,39 @@
|
|||
Frame Relay (FR) support for Linux is built into a two-tiered system of device
|
||||
drivers. The upper layer implements the RFC 1490 FR specification and uses the
|
||||
Data Link Connection Identifier (DLCI) as its hardware address. Usually these
|
||||
are assigned by your network supplier, who gives you the number/numbers of
|
||||
the Virtual Connections (VC) assigned to you.
|
||||
|
||||
Each DLCI is a point-to-point link between your machine and a remote one.
|
||||
As such, a separate device is needed to accommodate the routing. Within the
|
||||
net-tools archives is 'dlcicfg'. This program will communicate with the
|
||||
base "DLCI" device, and create new net devices named 'dlci00', 'dlci01'...
|
||||
The configuration script will ask you how many DLCIs you need, as well as
|
||||
how many DLCIs you want to assign to each Frame Relay Access Device (FRAD).
|
||||
|
||||
The DLCI uses a number of function calls to communicate with the FRAD, all
|
||||
of which are stored in the FRAD's private data area: assoc/deassoc,
|
||||
activate/deactivate and dlci_config. The DLCI supplies a receive function
|
||||
to the FRAD to accept incoming packets.
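
The relationship between the two layers can be pictured as a table of callbacks. The sketch below is purely illustrative: the struct layout, signatures, helper and "toy" driver are hypothetical and do not reproduce the kernel's definitions. It only shows the shape of the interface, with the FRAD exposing assoc/deassoc, activate/deactivate and dlci_config, and the DLCI layer supplying a receive function.

```c
#include <assert.h>
#include <stddef.h>

struct frad;

/* Hypothetical callback table kept in the FRAD's private data. */
struct frad_ops {
    int (*assoc)(struct frad *f, int dlci);
    int (*deassoc)(struct frad *f, int dlci);
    int (*activate)(struct frad *f, int dlci);
    int (*deactivate)(struct frad *f, int dlci);
    int (*dlci_config)(struct frad *f, int dlci, const void *conf);
};

struct frad {
    struct frad_ops ops;    /* filled in by the FRAD driver */
    /* supplied by the DLCI layer to accept incoming packets */
    void (*receive)(int dlci, const void *pkt, size_t len);
};

/* DLCI-layer helper: attach a DLCI to the FRAD, then activate it.
 * On activation failure the association is undone. */
int dlci_bring_up(struct frad *f, int dlci)
{
    int err;

    assert(f && f->ops.assoc && f->ops.activate);
    err = f->ops.assoc(f, dlci);
    if (err)
        return err;
    err = f->ops.activate(f, dlci);
    if (err && f->ops.deassoc)
        f->ops.deassoc(f, dlci);
    return err;
}

/* Toy FRAD used only to exercise the sketch. */
int toy_active_dlci = -1;

int toy_assoc(struct frad *f, int dlci)
{
    (void)f;
    return dlci > 0 ? 0 : -1;   /* reject non-positive numbers */
}

int toy_activate(struct frad *f, int dlci)
{
    (void)f;
    toy_active_dlci = dlci;
    return 0;
}
```

Keeping the callbacks in a table means the DLCI layer never needs to know which board sits underneath, which fits the note in this document that further FRAD drivers can be added later.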
|
||||
|
||||
With this initial offering, only 1 FRAD driver is available. With many thanks
|
||||
to Sangoma Technologies, David Mandelstam & Gene Kozin, the S502A, S502E &
|
||||
S508 are supported. This driver is currently set up for only FR, but as
|
||||
Sangoma makes more firmware modules available, it can be updated to provide
|
||||
them as well.
|
||||
|
||||
Configuration of the FRAD makes use of another net-tools program, 'fradcfg'.
|
||||
This program makes use of a configuration file (which dlcicfg can also read)
|
||||
to specify the types of boards to be configured as FRADs, as well as perform
|
||||
any board specific configuration. The Sangoma module of fradcfg loads the
|
||||
FR firmware into the card, sets the irq/port/memory information, and provides
|
||||
an initial configuration.
|
||||
|
||||
Additional FRAD device drivers can be added as hardware is available.
|
||||
|
||||
At this time, the dlcicfg and fradcfg programs have not been incorporated into
|
||||
the net-tools distribution. They can be found at ftp.invlogic.com, in
|
||||
/pub/linux. Note that with OS/2 FTPD, you end up in /pub by default, so just
|
||||
use 'cd linux'. v0.10 is for use on pre-2.0.3 and earlier, v0.15 is for
|
||||
pre-2.0.4 and later.
|
||||
|
||||
117
Documentation/networking/gen_stats.txt
Normal file
|
|
@ -0,0 +1,117 @@
|
|||
Generic networking statistics for netlink users
|
||||
======================================================================
|
||||
|
||||
Statistic counters are grouped into structs:
|
||||
|
||||
Struct               TLV type              Description
----------------------------------------------------------------------
gnet_stats_basic     TCA_STATS_BASIC       Basic statistics
gnet_stats_rate_est  TCA_STATS_RATE_EST    Rate estimator
gnet_stats_queue     TCA_STATS_QUEUE       Queue statistics
none                 TCA_STATS_APP         Application specific
|
||||
|
||||
|
||||
Collecting:
|
||||
-----------
|
||||
|
||||
Declare the statistic structs you need:
|
||||
struct mystruct {
        struct gnet_stats_basic bstats;
        struct gnet_stats_queue qstats;
        ...
};
|
||||
|
||||
Update statistics:
|
||||
mystruct->bstats.packets++;
mystruct->qstats.backlog += skb->len;
|
||||
|
||||
|
||||
Export to userspace (Dump):
|
||||
---------------------------
|
||||
|
||||
my_dumping_routine(struct sk_buff *skb, ...)
{
        struct gnet_dump dump;

        if (gnet_stats_start_copy(skb, TCA_STATS2, &mystruct->lock, &dump) < 0)
                goto rtattr_failure;

        if (gnet_stats_copy_basic(&dump, &mystruct->bstats) < 0 ||
            gnet_stats_copy_queue(&dump, &mystruct->qstats) < 0 ||
            gnet_stats_copy_app(&dump, &xstats, sizeof(xstats)) < 0)
                goto rtattr_failure;

        if (gnet_stats_finish_copy(&dump) < 0)
                goto rtattr_failure;
        ...
}
|
||||
|
||||
TCA_STATS/TCA_XSTATS backward compatibility:
|
||||
--------------------------------------------
|
||||
|
||||
Prior users of struct tc_stats and xstats can maintain backward
|
||||
compatibility by calling the compat wrappers to keep providing the
|
||||
existing TLV types.
|
||||
|
||||
my_dumping_routine(struct sk_buff *skb, ...)
{
        if (gnet_stats_start_copy_compat(skb, TCA_STATS2, TCA_STATS,
                        TCA_XSTATS, &mystruct->lock, &dump) < 0)
                goto rtattr_failure;
        ...
}
|
||||
|
||||
A struct tc_stats will be filled out during gnet_stats_copy_* calls
|
||||
and appended to the skb. TCA_XSTATS is provided if gnet_stats_copy_app
|
||||
was called.
|
||||
|
||||
|
||||
Locking:
|
||||
--------
|
||||
|
||||
Locks are taken before writing and released once all statistics have
|
||||
been written. Locks are always released in case of an error. You
|
||||
are responsible for making sure that the lock is initialized.
|
||||
|
||||
|
||||
Rate Estimator:
|
||||
--------------
|
||||
|
||||
0) Prepare an estimator attribute. Most likely this would be in user
|
||||
space. The value of this TLV should contain a tc_estimator structure.
|
||||
As usual, such a TLV needs to be 32 bit aligned and therefore the
|
||||
length needs to be appropriately set, etc. The estimator interval
|
||||
and ewma log need to be converted to the appropriate values.
|
||||
Using tc_estimator.c::tc_setup_estimator() is advisable as the
|
||||
conversion routine. It does a few clever things. It takes a time
|
||||
interval in microsecs, a time constant also in microsecs and a struct
|
||||
tc_estimator to be populated. The returned tc_estimator can be
|
||||
transported to the kernel. Transfer such a structure in a TLV of type
|
||||
TCA_RATE to your code in the kernel.
|
||||
|
||||
In the kernel when setting up:
|
||||
1) make sure you have basic stats and rate stats set up first.
|
||||
2) make sure you have initialized the stats lock that is used to set up such
|
||||
stats.
|
||||
3) Now initialize a new estimator:
|
||||
|
||||
int ret = gen_new_estimator(my_basicstats, my_rate_est_stats,
                mystats_lock, attr_with_tcestimator_struct);

if ret == 0
        success
else
        failed
|
||||
|
||||
From now on, every time you dump my_rate_est_stats it will contain
|
||||
up-to-date info.
|
||||
|
||||
Once you are done, call gen_kill_estimator(my_basicstats,
|
||||
my_rate_est_stats). Make sure that my_basicstats and my_rate_est_stats
|
||||
are still valid (i.e. still exist) at the time of making this call.
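
Two pieces of arithmetic from step 0 can be sketched in isolation: rounding a TLV length up to a 32-bit boundary, and what an "ewma log" means for smoothing. The helper names are hypothetical. The smoothing step assumes a shift-based exponentially weighted moving average with weight 2^-ewma_log, which is the usual reading of the term, not a copy of the kernel's estimator.

```c
#include <assert.h>
#include <stddef.h>

/* Round a TLV length up to the next multiple of 4 bytes. */
size_t tlv_align(size_t len)
{
    return (len + 3u) & ~(size_t)3u;
}

/* One smoothing step: move `avg` towards `sample` by 1/2^ewma_log.
 * Assumption: shift-based EWMA, used here for illustration only. */
unsigned long ewma_step(unsigned long avg, unsigned long sample,
                        unsigned ewma_log)
{
    assert(ewma_log < sizeof(unsigned long) * 8);
    return avg + (sample >> ewma_log) - (avg >> ewma_log);
}
```

With ewma_log set to 2, an average of 0 and a steady sample of 1024 move to 256 after one step and 448 after two. A larger ewma_log gives a longer time constant.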
|
||||
|
||||
|
||||
Authors:
|
||||
--------
|
||||
Thomas Graf <tgraf@suug.ch>
|
||||
Jamal Hadi Salim <hadi@cyberus.ca>
|
||||
132
Documentation/networking/generic-hdlc.txt
Normal file
|
|
@ -0,0 +1,132 @@
|
|||
Generic HDLC layer
|
||||
Krzysztof Halasa <khc@pm.waw.pl>
|
||||
|
||||
|
||||
Generic HDLC layer currently supports:
|
||||
1. Frame Relay (ANSI, CCITT, Cisco and no LMI)
|
||||
- Normal (routed) and Ethernet-bridged (Ethernet device emulation)
|
||||
interfaces can share a single PVC.
|
||||
- ARP support (no InARP support in the kernel - there is an
|
||||
experimental InARP user-space daemon available on:
|
||||
http://www.kernel.org/pub/linux/utils/net/hdlc/).
|
||||
2. raw HDLC - either IP (IPv4) interface or Ethernet device emulation
|
||||
3. Cisco HDLC
|
||||
4. PPP
|
||||
5. X.25 (uses X.25 routines).
|
||||
|
||||
Generic HDLC is a protocol driver only - it needs a low-level driver
|
||||
for your particular hardware.
|
||||
|
||||
Ethernet device emulation (using HDLC or Frame-Relay PVC) is compatible
|
||||
with IEEE 802.1Q (VLANs) and 802.1D (Ethernet bridging).
|
||||
|
||||
|
||||
Make sure hdlc.o and the hardware driver are loaded. It should
|
||||
create a number of "hdlc" (hdlc0 etc) network devices, one for each
|
||||
WAN port. You'll need the "sethdlc" utility, get it from:
|
||||
http://www.kernel.org/pub/linux/utils/net/hdlc/
|
||||
|
||||
Compile the sethdlc.c utility:
|
||||
        gcc -O2 -Wall -o sethdlc sethdlc.c
|
||||
Make sure you're using a correct version of sethdlc for your kernel.
|
||||
|
||||
Use sethdlc to set physical interface, clock rate, HDLC mode used,
|
||||
and add any required PVCs if using Frame Relay.
|
||||
Usually you want something like:
|
||||
|
||||
        sethdlc hdlc0 clock int rate 128000
        sethdlc hdlc0 cisco interval 10 timeout 25
or
        sethdlc hdlc0 rs232 clock ext
        sethdlc hdlc0 fr lmi ansi
        sethdlc hdlc0 create 99
        ifconfig hdlc0 up
        ifconfig pvc0 localIP pointopoint remoteIP
|
||||
|
||||
In Frame Relay mode, bring the master hdlc device up with ifconfig
(without assigning any IP address to it) before using pvc devices.
|
||||
|
||||
|
||||
Setting interface:
|
||||
|
||||
* v35 | rs232 | x21 | t1 | e1 - sets physical interface for a given port
                                if the card has software-selectable interfaces
  loopback - activate hardware loopback (for testing only)
* clock ext - both RX clock and TX clock external
* clock int - both RX clock and TX clock internal
* clock txint - RX clock external, TX clock internal
* clock txfromrx - RX clock external, TX clock derived from RX clock
* rate - sets clock rate in bps (for "int" or "txint" clock only)
|
||||
|
||||
|
||||
Setting protocol:
|
||||
|
||||
* hdlc - sets raw HDLC (IP-only) mode
  nrz / nrzi / fm-mark / fm-space / manchester - sets transmission code
  no-parity / crc16 / crc16-pr0 (CRC16 with preset zeros) / crc32-itu
  crc16-itu (CRC16 with ITU-T polynomial) / crc16-itu-pr0 - sets parity

* hdlc-eth - Ethernet device emulation using HDLC. Parity and encoding
  as above.

* cisco - sets Cisco HDLC mode (IP, IPv6 and IPX supported)
  interval - time in seconds between keepalive packets
  timeout - time in seconds after last received keepalive packet before
            we assume the link is down

* ppp - sets synchronous PPP mode

* x25 - sets X.25 mode

* fr - Frame Relay mode
  lmi ansi / ccitt / cisco / none - LMI (link management) type
  dce - Frame Relay DCE (network) side LMI instead of default DTE (user).
  It has nothing to do with clocks!
  t391 - link integrity verification polling timer (in seconds) - user
  t392 - polling verification timer (in seconds) - network
  n391 - full status polling counter - user
  n392 - error threshold - both user and network
  n393 - monitored events count - both user and network
|
||||
|
||||
Frame-Relay only:
|
||||
* create n | delete n - adds / deletes PVC interface with DLCI #n.
  Newly created interface will be named pvc0, pvc1 etc.

* create ether n | delete ether n - adds a device for Ethernet-bridged
  frames. The device will be named pvceth0, pvceth1 etc.
|
||||
|
||||
|
||||
|
||||
|
||||
Board-specific issues
|
||||
---------------------
|
||||
|
||||
n2.o and c101.o need parameters to work:
|
||||
|
||||
        insmod n2 hw=io,irq,ram,ports[:io,irq,...]
example:
        insmod n2 hw=0x300,10,0xD0000,01

or
        insmod c101 hw=irq,ram[:irq,...]
example:
        insmod c101 hw=9,0xdc000
|
||||
|
||||
If built into the kernel, these drivers need kernel (command line) parameters:
|
||||
        n2.hw=io,irq,ram,ports:...
or
        c101.hw=irq,ram:...
|
||||
|
||||
|
||||
|
||||
If you have a problem with an N2, C101 or PLX200SYN card, you can issue the
|
||||
"private" command to see the port's packet descriptor rings (in kernel logs):
|
||||
|
||||
        sethdlc hdlc0 private
|
||||
|
||||
The hardware driver has to be built with #define DEBUG_RINGS.
|
||||
Attaching this info to bug reports would be helpful. Anyway, let me know
|
||||
if you have problems using this.
|
||||
|
||||
For patches and other info look at:
|
||||
<http://www.kernel.org/pub/linux/utils/net/hdlc/>.
|
||||
3
Documentation/networking/generic_netlink.txt
Normal file
|
|
@ -0,0 +1,3 @@
|
|||
A wiki document on how to use Generic Netlink can be found here:
|
||||
|
||||
* http://www.linuxfoundation.org/collaborate/workgroups/networking/generic_netlink_howto
|
||||
42
Documentation/networking/gianfar.txt
Normal file
|
|
@ -0,0 +1,42 @@
|
|||
The Gianfar Ethernet Driver
|
||||
|
||||
Author: Andy Fleming <afleming@freescale.com>
|
||||
Updated: 2005-07-28
|
||||
|
||||
|
||||
CHECKSUM OFFLOADING
|
||||
|
||||
The eTSEC controller (first included in parts from late 2005 like
|
||||
the 8548) has the ability to perform TCP, UDP, and IP checksums
|
||||
in hardware. The Linux kernel only offloads the TCP and UDP
|
||||
checksums (and always performs the pseudo header checksums), so
|
||||
the driver only supports checksumming for TCP/IP and UDP/IP
|
||||
packets. Use ethtool to enable or disable this feature for RX
|
||||
and TX.
|
||||
|
||||
VLAN
|
||||
|
||||
In order to use VLAN, please consult Linux documentation on
|
||||
configuring VLANs. The gianfar driver supports hardware insertion and
|
||||
extraction of VLAN headers, but not filtering. Filtering will be
|
||||
done by the kernel.
|
||||
|
||||
MULTICASTING
|
||||
|
||||
The gianfar driver supports using the group hash table on the
|
||||
TSEC (and the extended hash table on the eTSEC) for multicast
|
||||
filtering. On the eTSEC, the exact-match MAC registers are used
|
||||
before the hash tables. See Linux documentation on how to join
|
||||
multicast groups.
|
||||
|
||||
PADDING
|
||||
|
||||
The gianfar driver supports padding received frames with 2 bytes
|
||||
to align the IP header to a 16-byte boundary, when supported by
|
||||
hardware.
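
The reason two bytes of padding help can be shown with a small calculation. The sketch assumes an untagged 14-byte Ethernet header and a receive buffer that itself starts on the alignment boundary; the helper names are hypothetical.

```c
#include <assert.h>

#define ETH_HEADER_LEN 14u  /* untagged: 6 dst + 6 src + 2 ethertype */

/* Offset of the IP header from the start of the receive buffer. */
unsigned ip_header_offset(unsigned pad_bytes)
{
    return pad_bytes + ETH_HEADER_LEN;
}

/* Non-zero if the IP header lands on a multiple of `boundary` bytes,
 * assuming the buffer start is itself aligned to `boundary`. */
int ip_header_is_aligned(unsigned pad_bytes, unsigned boundary)
{
    assert(boundary > 0);
    return ip_header_offset(pad_bytes) % boundary == 0;
}
```

Without padding the IP header starts at offset 14, which is not even 4-byte aligned. With 2 bytes of padding it starts at offset 16.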
|
||||
|
||||
ETHTOOL
|
||||
|
||||
The gianfar driver supports the use of ethtool for many
|
||||
configuration options. You must run ethtool only on currently
|
||||
open interfaces. See ethtool documentation for details.
|
||||
118
Documentation/networking/i40e.txt
Normal file
|
|
@ -0,0 +1,118 @@
|
|||
Linux Base Driver for the Intel(R) Ethernet Controller XL710 Family
|
||||
===================================================================
|
||||
|
||||
Intel i40e Linux driver.
|
||||
Copyright(c) 2013 Intel Corporation.
|
||||
|
||||
Contents
|
||||
========
|
||||
|
||||
- Identifying Your Adapter
|
||||
- Additional Configurations
|
||||
- Performance Tuning
|
||||
- Known Issues
|
||||
- Support
|
||||
|
||||
|
||||
Identifying Your Adapter
|
||||
========================
|
||||
|
||||
The driver in this release is compatible with the Intel Ethernet
|
||||
Controller XL710 Family.
|
||||
|
||||
For more information on how to identify your adapter, go to the Adapter &
|
||||
Driver ID Guide at:
|
||||
|
||||
http://support.intel.com/support/network/sb/CS-012904.htm
|
||||
|
||||
|
||||
Enabling the driver
|
||||
===================
|
||||
|
||||
The driver is enabled via the standard kernel configuration system,
|
||||
using the make command:
|
||||
|
||||
        make oldconfig/silentoldconfig/menuconfig/etc.
|
||||
|
||||
The driver is located in the menu structure at:
|
||||
|
||||
        -> Device Drivers
          -> Network device support (NETDEVICES [=y])
            -> Ethernet driver support
              -> Intel devices
                -> Intel(R) Ethernet Controller XL710 Family
|
||||
|
||||
Additional Configurations
|
||||
=========================
|
||||
|
||||
Generic Receive Offload (GRO)
|
||||
-----------------------------
|
||||
The driver supports the in-kernel software implementation of GRO. GRO has
|
||||
shown that by coalescing Rx traffic into larger chunks of data, CPU
|
||||
utilization can be significantly reduced when under large Rx load. GRO is
|
||||
an evolution of the previously-used LRO interface. GRO is able to coalesce
|
||||
other protocols besides TCP. It's also safe to use with configurations that
|
||||
are problematic for LRO, namely bridging and iSCSI.
|
||||
|
||||
Ethtool
|
||||
-------
|
||||
The driver utilizes the ethtool interface for driver configuration and
|
||||
diagnostics, as well as displaying statistical information. The latest
|
||||
ethtool version is required for this functionality.
|
||||
|
||||
The latest release of ethtool can be found at
|
||||
https://www.kernel.org/pub/software/network/ethtool
|
||||
|
||||
Data Center Bridging (DCB)
|
||||
--------------------------
|
||||
DCB configuration is not currently supported.
|
||||
|
||||
FCoE
|
||||
----
|
||||
The driver supports Fibre Channel over Ethernet (FCoE) and Data Center
|
||||
Bridging (DCB) functionality. Configuring DCB and FCoE is outside the scope
|
||||
of this driver doc. Refer to http://www.open-fcoe.org/ for FCoE project
|
||||
information and http://www.open-lldp.org/ or email list
|
||||
e1000-eedc@lists.sourceforge.net for DCB information.
|
||||
|
||||
MAC and VLAN anti-spoofing feature
|
||||
----------------------------------
|
||||
When a malicious driver attempts to send a spoofed packet, it is dropped by
|
||||
the hardware and not transmitted. An interrupt is sent to the PF driver
|
||||
notifying it of the spoof attempt.
|
||||
|
||||
When a spoofed packet is detected the PF driver will send the following
|
||||
message to the system log (displayed by the "dmesg" command):
|
||||
|
||||
Spoof event(s) detected on VF (n)
|
||||
|
||||
Where n=the VF that attempted to do the spoofing.
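
As a rough aid for spotting these messages, the sketch below counts
matching lines in kernel log text. The function name count_spoof_events is
hypothetical and not part of the driver, and the pattern tolerates small
spacing differences around the VF number, which is an assumption about how
the message may be printed.

```shell
# Count spoof-event messages in kernel log text supplied on stdin.
# The pattern accepts "VF (3)", "VF(3)" and "VF 3", since the exact
# spacing may differ between driver versions (an assumption).
count_spoof_events() {
    # grep -c exits non-zero when nothing matches; "|| true" keeps the
    # function usable in scripts that run with "set -e".
    grep -c -E 'Spoof event\(s\) detected on VF ?\(?[0-9]+\)?' || true
}

# Typical use on a live system (reading the kernel log may need root):
#   dmesg | count_spoof_events
```

A non-zero count only says that the message was logged; the VF number in
each line still has to be read from the log itself.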
|
||||
|
||||
|
||||
Performance Tuning
|
||||
==================
|
||||
|
||||
An excellent article on performance tuning can be found at:
|
||||
|
||||
http://www.redhat.com/promo/summit/2008/downloads/pdf/Thursday/Mark_Wagner.pdf
|
||||
|
||||
|
||||
Known Issues
|
||||
============
|
||||
|
||||
|
||||
Support
|
||||
=======
|
||||
|
||||
For general information, go to the Intel support website at:
|
||||
|
||||
http://support.intel.com
|
||||
|
||||
or the Intel Wired Networking project hosted by Sourceforge at:
|
||||
|
||||
http://e1000.sourceforge.net
|
||||
|
||||
If an issue is identified with the released source code on the supported
|
||||
kernel with a supported adapter, email the specific information related
|
||||
to the issue to e1000-devel@lists.sourceforge.net and copy
|
||||
netdev@vger.kernel.org.
|
||||
47
Documentation/networking/i40evf.txt
Normal file
|
|
@ -0,0 +1,47 @@
|
|||
Linux* Base Driver for Intel(R) Network Connection
|
||||
==================================================
|
||||
|
||||
Intel XL710 X710 Virtual Function Linux driver.
|
||||
Copyright(c) 2013 Intel Corporation.
|
||||
|
||||
Contents
|
||||
========
|
||||
|
||||
- Identifying Your Adapter
|
||||
- Known Issues/Troubleshooting
|
||||
- Support
|
||||
|
||||
This file describes the i40evf Linux* Base Driver for the Intel(R) XL710
|
||||
X710 Virtual Function.
|
||||
|
||||
The i40evf driver supports XL710 and X710 virtual function devices that
|
||||
can only be activated on kernels with CONFIG_PCI_IOV enabled.
|
||||
|
||||
The guest OS loading the i40evf driver must support MSI-X interrupts.
|
||||
|
||||
Identifying Your Adapter
|
||||
========================
|
||||
|
||||
For more information on how to identify your adapter, go to the Adapter &
|
||||
Driver ID Guide at:
|
||||
|
||||
http://support.intel.com/support/go/network/adapter/idguide.htm
|
||||
|
||||
Known Issues/Troubleshooting
|
||||
============================
|
||||
|
||||
|
||||
Support
|
||||
=======
|
||||
|
||||
For general information, go to the Intel support website at:
|
||||
|
||||
http://support.intel.com
|
||||
|
||||
or the Intel Wired Networking project hosted by Sourceforge at:
|
||||
|
||||
http://sourceforge.net/projects/e1000
|
||||
|
||||
If an issue is identified with the released source code on the supported
|
||||
kernel with a supported adapter, email the specific information related
|
||||
to the issue to e1000-devel@lists.sf.net
|
||||
151
Documentation/networking/ieee802154.txt
Normal file
|
|
@ -0,0 +1,151 @@
|
|||
|
||||
Linux IEEE 802.15.4 implementation
|
||||
|
||||
|
||||
Introduction
|
||||
============
|
||||
The IEEE 802.15.4 working group focuses on standardization of the bottom
|
||||
two layers: Medium Access Control (MAC) and Physical (PHY). And there
|
||||
are mainly two options available for upper layers:
|
||||
- ZigBee - proprietary protocol from ZigBee Alliance
|
||||
- 6LoWPAN - IPv6 networking over low-rate personal area networks
|
||||
|
||||
The goal of the Linux-ZigBee project is to provide a complete implementation
|
||||
of IEEE 802.15.4 and 6LoWPAN protocols. IEEE 802.15.4 is a stack
|
||||
of protocols for organizing Low-Rate Wireless Personal Area Networks.
|
||||
|
||||
The stack is composed of three main parts:
|
||||
- IEEE 802.15.4 layer; we have chosen to use the plain Berkeley socket API,
|
||||
the generic Linux networking stack to transfer IEEE 802.15.4 messages
|
||||
and a special protocol over genetlink for configuration/management
|
||||
- MAC - provides access to shared channel and reliable data delivery
|
||||
- PHY - represents device drivers
|
||||
|
||||
|
||||
Socket API
|
||||
==========
|
||||
|
||||
int sd = socket(PF_IEEE802154, SOCK_DGRAM, 0);
|
||||
.....
|
||||
|
||||
The address family, socket addresses etc. are defined in the
|
||||
include/net/af_ieee802154.h header or in the special header
|
||||
in our userspace package (see either linux-zigbee sourceforge download page
|
||||
or git tree at git://linux-zigbee.git.sourceforge.net/gitroot/linux-zigbee).
|
||||
|
||||
One can use SOCK_RAW for passing raw data towards the device xmit function. YMMV.
|
||||
|
||||
|
||||
Kernel side
|
||||
=============
|
||||
|
||||
Like with WiFi, there are several types of devices implementing IEEE 802.15.4.
|
||||
1) 'HardMAC'. The MAC layer is implemented in the device itself, the device
|
||||
exports MLME and data API.
|
||||
2) 'SoftMAC' or just radio. These types of devices are just radio transceivers
|
||||
possibly with some kinds of acceleration like automatic CRC computation and
|
||||
comparison, automagic ACK handling, address matching, etc.
|
||||
|
||||
These types of devices require different approaches to be hooked into the Linux kernel.
|
||||
|
||||
|
||||
MLME - MAC Level Management
|
||||
============================
|
||||
|
||||
Most IEEE 802.15.4 MLME interfaces are mapped directly onto netlink commands.
|
||||
See the include/net/nl802154.h header. Our userspace tools package
|
||||
(see above) provides a CLI configuration utility for radio interfaces and a simple
|
||||
coordinator for IEEE 802.15.4 networks as example users of the MLME protocol.
|
||||
|
||||
|
||||
HardMAC
|
||||
=======
|
||||
|
||||
See the header include/net/ieee802154_netdev.h. You have to implement Linux
|
||||
net_device, with .type = ARPHRD_IEEE802154. Data is exchanged with socket family
|
||||
code via plain sk_buffs. On skb reception skb->cb must contain additional
|
||||
info as described in the struct ieee802154_mac_cb. During packet transmission
|
||||
the skb->cb is used to provide additional data to device's header_ops->create
|
||||
function. Be aware that this data can be overridden later (when socket code
|
||||
submits skb to qdisc), so if you need something from that cb later, you should
|
||||
store info in the skb->data on your own.
|
||||
|
||||
To hook the MLME interface you have to populate the ml_priv field of your
|
||||
net_device with a pointer to struct ieee802154_mlme_ops instance. The fields
|
||||
assoc_req, assoc_resp, disassoc_req, start_req, and scan_req are optional.
|
||||
All other fields are required.
|
||||
|
||||
We provide an example of a simple HardMAC driver at drivers/ieee802154/fakehard.c
|
||||
|
||||
|
||||
SoftMAC
|
||||
=======
|
||||
|
||||
The MAC is the middle layer in the IEEE 802.15.4 Linux stack. At the moment it
|
||||
provides an interface for driver registration and management of slave interfaces.
|
||||
|
||||
NOTE: Currently only the monitor device type is supported - it's the IEEE 802.15.4
|
||||
stack interface for network sniffers (e.g. WireShark).
|
||||
|
||||
This layer is going to be extended soon.
|
||||
|
||||
See header include/net/mac802154.h and several drivers in drivers/ieee802154/.
|
||||
|
||||
|
||||
Device drivers API
|
||||
==================
|
||||
|
||||
The include/net/mac802154.h header defines the following functions:
|
||||
- struct ieee802154_dev *ieee802154_alloc_device
|
||||
(size_t priv_size, struct ieee802154_ops *ops):
|
||||
allocation of IEEE 802.15.4 compatible device
|
||||
|
||||
- void ieee802154_free_device(struct ieee802154_dev *dev):
|
||||
freeing allocated device
|
||||
|
||||
- int ieee802154_register_device(struct ieee802154_dev *dev):
|
||||
register PHY in the system
|
||||
|
||||
- void ieee802154_unregister_device(struct ieee802154_dev *dev):
|
||||
freeing registered PHY
|
||||
|
||||
Moreover, the IEEE 802.15.4 device operations structure should be filled in.
|
||||
|
||||
Fake drivers
|
||||
============
|
||||
|
||||
In addition there are two drivers available which simulate real devices with
|
||||
HardMAC (fakehard) and SoftMAC (fakelb - IEEE 802.15.4 loopback driver)
|
||||
interfaces. This option makes it possible to test and debug the stack without
|
||||
usage of real hardware.
|
||||
|
||||
See the sources in the drivers/ieee802154 folder for more details.
|
||||
|
||||
|
||||
6LoWPAN Linux implementation
|
||||
============================
|
||||
|
||||
The IEEE 802.15.4 standard specifies an MTU of 127 bytes, yielding about 80
|
||||
octets of actual MAC payload once security is turned on, on a wireless link
|
||||
with a link throughput of 250 kbps or less. The 6LoWPAN adaptation format
|
||||
[RFC4944] was specified to carry IPv6 datagrams over such constrained links,
|
||||
taking into account limited bandwidth, memory, or energy resources that are
|
||||
expected in applications such as wireless Sensor Networks. [RFC4944] defines
|
||||
a Mesh Addressing header to support sub-IP forwarding, a Fragmentation header
|
||||
to support the IPv6 minimum MTU requirement [RFC2460], and stateless header
|
||||
compression for IPv6 datagrams (LOWPAN_HC1 and LOWPAN_HC2) to reduce the
|
||||
relatively large IPv6 and UDP headers down to (in the best case) several bytes.
|
||||
|
||||
In September 2011 an update to the standard was published - [RFC6282].
|
||||
It deprecates HC1 and HC2 compression and defines IPHC encoding format which is
|
||||
used in this Linux implementation.
|
||||
|
||||
All the code related to 6lowpan can be found in the files: net/ieee802154/6lowpan.*
|
||||
|
||||
To set up a 6lowpan interface you need (busybox release > 1.17.0):
|
||||
1. Add IEEE802.15.4 interface and initialize PANid;
|
||||
2. Add 6lowpan interface by command like:
|
||||
# ip link add link wpan0 name lowpan0 type lowpan
|
||||
3. Set MAC (if needed):
|
||||
# ip link set lowpan0 address de:ad:be:ef:ca:fe:ba:be
|
||||
4. Bring up 'lowpan0' interface
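
Steps 2 to 4 can be sketched as a small dry-run helper that only prints
the commands. The function name lowpan_setup_commands is hypothetical, and
step 1 (adding the IEEE 802.15.4 interface and setting the PAN id) is left
out because it depends on the userspace tools in use. The final "ip link
set ... up" line is the usual iproute2 way to bring an interface up.

```shell
# Print the ip(8) commands for steps 2-4; nothing is executed here.
# Arguments: <wpan-interface> <lowpan-interface> [mac-address]
lowpan_setup_commands() {
    lsc_wpan=$1
    lsc_lowpan=$2
    lsc_mac=${3:-}

    # Step 2: create the 6lowpan interface on top of the wpan interface.
    echo "ip link add link $lsc_wpan name $lsc_lowpan type lowpan"

    # Step 3 (optional): set the MAC address if one was given.
    if [ -n "$lsc_mac" ]; then
        echo "ip link set $lsc_lowpan address $lsc_mac"
    fi

    # Step 4: bring the interface up.
    echo "ip link set $lsc_lowpan up"
}

lowpan_setup_commands wpan0 lowpan0 de:ad:be:ef:ca:fe:ba:be
```

Printing rather than running the commands keeps the sketch safe to try on
a machine without IEEE 802.15.4 hardware; on a real system the same lines
would be run as root.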
|
||||
129
Documentation/networking/igb.txt
Normal file
|
|
@ -0,0 +1,129 @@
|
|||
Linux* Base Driver for Intel(R) Ethernet Network Connection
|
||||
===========================================================
|
||||
|
||||
Intel Gigabit Linux driver.
|
||||
Copyright(c) 1999 - 2013 Intel Corporation.
|
||||
|
||||
Contents
|
||||
========
|
||||
|
||||
- Identifying Your Adapter
|
||||
- Additional Configurations
|
||||
- Support
|
||||
|
||||
Identifying Your Adapter
|
||||
========================
|
||||
|
||||
This driver supports all 82575, 82576 and 82580-based Intel (R) gigabit network
|
||||
connections.
|
||||
|
||||
For specific information on how to identify your adapter, go to the Adapter &
|
||||
Driver ID Guide at:
|
||||
|
||||
http://support.intel.com/support/go/network/adapter/idguide.htm
|
||||
|
||||
Command Line Parameters
|
||||
=======================
|
||||
|
||||
The default value for each parameter is generally the recommended setting,
|
||||
unless otherwise noted.
|
||||
|
||||
max_vfs
|
||||
-------
|
||||
Valid Range: 0-7
|
||||
Default Value: 0
|
||||
|
||||
This parameter adds support for SR-IOV. It causes the driver to spawn up to
|
||||
max_vfs virtual functions.
|
||||
|
||||
Additional Configurations
|
||||
=========================
|
||||
|
||||
Jumbo Frames
|
||||
------------
|
||||
Jumbo Frames support is enabled by changing the MTU to a value larger than
|
||||
the default of 1500. Use the ifconfig command to increase the MTU size.
|
||||
For example:
|
||||
|
||||
ifconfig eth<x> mtu 9000 up
|
||||
|
||||
This setting is not saved across reboots.
|
||||
|
||||
Notes:
|
||||
|
||||
- The maximum MTU setting for Jumbo Frames is 9216. This value coincides
|
||||
with the maximum Jumbo Frames size of 9234 bytes.
|
||||
|
||||
- Using Jumbo frames at 10 or 100 Mbps is not supported and may result in
|
||||
poor performance or loss of link.
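
The 18-byte gap between the maximum MTU (9216) and the maximum frame size
(9234) is consistent with a 14-byte Ethernet header plus a 4-byte frame
check sequence. That breakdown is an assumption made for this sketch, not
something stated above, and the helper name frame_size_for_mtu is
hypothetical.

```shell
# Assumed per-frame overhead: Ethernet header plus frame check sequence.
ETH_HEADER_LEN=14
ETH_FCS_LEN=4

# Print the on-wire frame size that corresponds to a given MTU.
frame_size_for_mtu() {
    echo $(( $1 + ETH_HEADER_LEN + ETH_FCS_LEN ))
}

frame_size_for_mtu 9216   # prints 9234

# On systems without ifconfig, the MTU can be set with iproute2:
#   ip link set dev eth0 mtu 9000
```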
|
||||
|
||||
ethtool
|
||||
-------
|
||||
The driver utilizes the ethtool interface for driver configuration and
|
||||
diagnostics, as well as displaying statistical information. The latest
|
||||
version of ethtool can be found at:
|
||||
|
||||
http://ftp.kernel.org/pub/software/network/ethtool/
|
||||
|
||||
Enabling Wake on LAN* (WoL)
|
||||
---------------------------
|
||||
WoL is configured through the ethtool* utility.
|
||||
|
||||
For instructions on enabling WoL with ethtool, refer to the ethtool man page.
|
||||
|
||||
WoL will be enabled on the system during the next shut down or reboot.
|
||||
For this driver version, in order to enable WoL, the igb driver must be
|
||||
loaded when shutting down or rebooting the system.
|
||||
|
||||
Wake On LAN is only supported on port A of multi-port adapters.
|
||||
|
||||
Wake On LAN is not supported for the Intel(R) Gigabit VT Quad Port Server
|
||||
Adapter.
|
||||
|
||||
Multiqueue
|
||||
----------
|
||||
In this mode, a separate MSI-X vector is allocated for each queue and one
|
||||
for "other" interrupts such as link status change and errors. All
|
||||
interrupts are throttled via interrupt moderation. Interrupt moderation
|
||||
must be used to avoid interrupt storms while the driver is processing one
|
||||
interrupt. The moderation value should be at least as large as the expected
|
||||
time for the driver to process an interrupt. Multiqueue is off by default.
|
||||
|
||||
REQUIREMENTS: MSI-X support is required for Multiqueue. If MSI-X is not
|
||||
found, the system will fall back to MSI or to Legacy interrupts.
|
||||
|
||||
MAC and VLAN anti-spoofing feature
|
||||
----------------------------------
|
||||
When a malicious driver attempts to send a spoofed packet, it is dropped by
|
||||
the hardware and not transmitted. An interrupt is sent to the PF driver
|
||||
notifying it of the spoof attempt.
|
||||
|
||||
When a spoofed packet is detected the PF driver will send the following
|
||||
message to the system log (displayed by the "dmesg" command):
|
||||
|
||||
Spoof event(s) detected on VF(n)
|
||||
|
||||
Where n=the VF that attempted to do the spoofing.
|
||||
|
||||
Setting MAC Address, VLAN and Rate Limit Using IProute2 Tool
|
||||
------------------------------------------------------------
|
||||
You can set a MAC address of a Virtual Function (VF), a default VLAN and the
|
||||
rate limit using the IProute2 tool. Download the latest version of the
|
||||
iproute2 tool from Sourceforge if your version does not have all the
|
||||
features you require.
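
As a sketch of what this looks like with iproute2, the helper below prints
the "ip link set ... vf ..." commands for one VF. The function name
vf_config_commands and the example values are hypothetical. The VF must
already exist (for example, created through the max_vfs parameter), and
the rate is assumed to be given in Mbps.

```shell
# Print the ip(8) commands that set MAC, default VLAN and rate limit
# for one VF; nothing is executed here.
# Arguments: <pf-interface> <vf-number> <mac> <vlan-id> <rate-mbps>
vf_config_commands() {
    vfc_pf=$1
    vfc_vf=$2
    vfc_mac=$3
    vfc_vlan=$4
    vfc_rate=$5

    echo "ip link set $vfc_pf vf $vfc_vf mac $vfc_mac"
    echo "ip link set $vfc_pf vf $vfc_vf vlan $vfc_vlan"
    echo "ip link set $vfc_pf vf $vfc_vf rate $vfc_rate"
}

vf_config_commands eth0 0 00:11:22:33:44:55 100 500
```

The resulting settings can be inspected with "ip link show" on the PF
interface, which lists each VF with its MAC address and VLAN.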
|
||||
|
||||
|
||||
Support
|
||||
=======
|
||||
|
||||
For general information, go to the Intel support website at:
|
||||
|
||||
www.intel.com/support/
|
||||
|
||||
or the Intel Wired Networking project hosted by Sourceforge at:
|
||||
|
||||
http://sourceforge.net/projects/e1000
|
||||
|
||||
If an issue is identified with the released source code on the supported
|
||||
kernel with a supported adapter, email the specific information related
|
||||
to the issue to e1000-devel@lists.sf.net
|
||||
80
Documentation/networking/igbvf.txt
Normal file
|
|
@ -0,0 +1,80 @@
|
|||
Linux* Base Driver for Intel(R) Ethernet Network Connection
|
||||
===========================================================
|
||||
|
||||
Intel Gigabit Linux driver.
|
||||
Copyright(c) 1999 - 2013 Intel Corporation.
|
||||
|
||||
Contents
|
||||
========
|
||||
|
||||
- Identifying Your Adapter
|
||||
- Additional Configurations
|
||||
- Support
|
||||
|
||||
This file describes the igbvf Linux* Base Driver for Intel Network Connection.
|
||||
|
||||
The igbvf driver supports 82576-based virtual function devices that can only
|
||||
be activated on kernels that support SR-IOV. SR-IOV requires the correct
|
||||
platform and OS support.
|
||||
|
||||
The igbvf driver requires the igb driver, version 2.0 or later. The igbvf
|
||||
driver supports virtual functions generated by the igb driver with a max_vfs
|
||||
value of 1 or greater. For more information on the max_vfs parameter refer
|
||||
to the README included with the igb driver.
|
||||
|
||||
The guest OS loading the igbvf driver must support MSI-X interrupts.
|
||||
|
||||
This driver is only supported as a loadable module at this time. Intel is
|
||||
not supplying patches against the kernel source to allow for static linking
|
||||
of the driver. For questions related to hardware requirements, refer to the
|
||||
documentation supplied with your Intel Gigabit adapter. All hardware
|
||||
requirements listed apply to use with Linux.
|
||||
|
||||
Instructions on updating ethtool can be found in the section "Additional
|
||||
Configurations" later in this document.
|
||||
|
||||
VLANs: There is a limit of a total of 32 shared VLANs to 1 or more VFs.
|
||||
|
||||
Identifying Your Adapter
|
||||
========================
|
||||
|
||||
The igbvf driver supports 82576-based virtual function devices that can only
|
||||
be activated on kernels that support SR-IOV.
|
||||
|
||||
For more information on how to identify your adapter, go to the Adapter &
|
||||
Driver ID Guide at:
|
||||
|
||||
http://support.intel.com/support/go/network/adapter/idguide.htm
|
||||
|
||||
For the latest Intel network drivers for Linux, refer to the following
|
||||
website. In the search field, enter your adapter name or type, or use the
|
||||
networking link on the left to search for your adapter:
|
||||
|
||||
http://downloadcenter.intel.com/scripts-df-external/Support_Intel.aspx
|
||||
|
||||
Additional Configurations
|
||||
=========================
|
||||
|
||||
ethtool
|
||||
-------
|
||||
The driver utilizes the ethtool interface for driver configuration and
|
||||
diagnostics, as well as displaying statistical information. The ethtool
|
||||
version 3.0 or later is required for this functionality, although we
|
||||
strongly recommend downloading the latest version at:
|
||||
|
||||
http://ftp.kernel.org/pub/software/network/ethtool/
|
||||
|
||||
Support
|
||||
=======
|
||||
|
||||
For general information, go to the Intel support website at:
|
||||
|
||||
http://support.intel.com
|
||||
|
||||
or the Intel Wired Networking project hosted by Sourceforge at:
|
||||
|
||||
http://sourceforge.net/projects/e1000
|
||||
|
||||
If an issue is identified with the released source code on the supported
|
||||
kernel with a supported adapter, email the specific information related
|
||||
to the issue to e1000-devel@lists.sf.net
|
||||
1793
Documentation/networking/ip-sysctl.txt
Normal file
File diff suppressed because it is too large
29
Documentation/networking/ip_dynaddr.txt
Normal file
|
|
@ -0,0 +1,29 @@
|
|||
IP dynamic address hack-port v0.03
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
This stuff allows diald ONESHOT connections to get established by
|
||||
dynamically changing packet source address (and socket's if local procs).
|
||||
It is implemented for TCP diald-box connections(1) and IP_MASQuerading(2).
|
||||
|
||||
If enabled[*] and forwarding interface has changed:
|
||||
1) Socket (and packet) source address is rewritten ON RETRANSMISSIONS
|
||||
while in SYN_SENT state (diald-box processes).
|
||||
2) Out-bounded MASQueraded source address changes ON OUTPUT (when
|
||||
internal host does retransmission) until a packet from outside is
|
||||
received by the tunnel.
|
||||
|
||||
This is especially helpful for auto dialup links (diald), where the
|
||||
``actual'' outgoing address is unknown at the moment the link is
|
||||
going up. So, the *same* (local AND masqueraded) connection requests that
|
||||
bring the link up will be able to get established.
|
||||
|
||||
[*] At boot, by default no address rewriting is attempted.
|
||||
To enable:
|
||||
# echo 1 > /proc/sys/net/ipv4/ip_dynaddr
|
||||
To enable verbose mode:
|
||||
# echo 2 > /proc/sys/net/ipv4/ip_dynaddr
|
||||
To disable (default)
|
||||
# echo 0 > /proc/sys/net/ipv4/ip_dynaddr
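
The three documented values can be summarised in a small helper. The
function name describe_ip_dynaddr is hypothetical, and only the values
0, 1 and 2 listed above are covered; anything else is reported as
unknown rather than guessed at.

```shell
# Translate an ip_dynaddr value into the behaviour described above.
describe_ip_dynaddr() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "enabled" ;;
        2) echo "enabled, verbose" ;;
        *) echo "unknown value: $1"; return 1 ;;
    esac
}

# Report the current setting when the file is readable on this system.
if [ -r /proc/sys/net/ipv4/ip_dynaddr ]; then
    describe_ip_dynaddr "$(cat /proc/sys/net/ipv4/ip_dynaddr)" || true
fi

# The same setting can be changed through sysctl(8), as root:
#   sysctl -w net.ipv4.ip_dynaddr=1
```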
|
||||
|
||||
Enjoy!
|
||||
|
||||
-- Juanjo <jjciarla@raiz.uncu.edu.ar>
|
||||
73
Documentation/networking/ipddp.txt
Normal file
|
|
@ -0,0 +1,73 @@
|
|||
Text file for ipddp.c:
|
||||
AppleTalk-IP Decapsulation and AppleTalk-IP Encapsulation
|
||||
|
||||
This text file is written by Jay Schulist <jschlst@samba.org>
|
||||
|
||||
Introduction
|
||||
------------
|
||||
|
||||
AppleTalk-IP (IPDDP) is the method computers connected to AppleTalk
|
||||
networks can use to communicate via IP. AppleTalk-IP is simply IP datagrams
|
||||
inside AppleTalk packets.
|
||||
|
||||
Through this driver you can either allow your Linux box to communicate
|
||||
IP over an AppleTalk network or you can provide IP gatewaying functions
|
||||
for your AppleTalk users.
|
||||
|
||||
You can currently encapsulate or decapsulate AppleTalk-IP on LocalTalk,
|
||||
EtherTalk and PPPTalk. The only limit on the protocol is that of what
|
||||
kernel AppleTalk layer and drivers are available.
|
||||
|
||||
Each mode requires its own user space software.
|
||||
|
||||
Compiling AppleTalk-IP Decapsulation/Encapsulation
|
||||
=================================================
|
||||
|
||||
AppleTalk-IP decapsulation needs to be compiled into your kernel. You
|
||||
will need to turn on AppleTalk-IP driver support. Then you will need to
|
||||
select ONE of the two options; IP to AppleTalk-IP encapsulation support or
|
||||
AppleTalk-IP to IP decapsulation support. If you compile the driver
|
||||
statically you will only be able to use the driver for the function you have
|
||||
enabled in the kernel. If you compile the driver as a module you can
|
||||
select what mode you want it to run in via a module loading param.
|
||||
ipddp_mode=1 for AppleTalk-IP encapsulation and ipddp_mode=2 for
|
||||
AppleTalk-IP to IP decapsulation.
|
||||
|
||||
Basic instructions for user space tools
|
||||
=======================================
|
||||
|
||||
I will briefly describe the operation of the tools, but you will
|
||||
need to consult the supporting documentation for each set of tools.
|
||||
|
||||
Decapsulation - You will need to download a software package called
|
||||
MacGate. In this distribution there will be a tool called MacRoute
|
||||
which enables you to add routes to the kernel for your Macs by hand.
|
||||
Also the tool MacRegGateWay is included to register the
|
||||
proper IP Gateway and IP addresses for your machine. Included in this
|
||||
distribution is a patch to netatalk-1.4b2+asun2.0a17.2 (available from
|
||||
ftp.u.washington.edu/pub/user-supported/asun/). This patch is optional
|
||||
but it allows automatic adding and deleting of routes for Macs. (Handy
|
||||
for locations with large Mac installations)
|
||||
|
||||
Encapsulation - You will need to download a software daemon called ipddpd.
|
||||
This software expects there to be an AppleTalk-IP gateway on the network.
|
||||
You will also need to add the proper routes to route your Linux box's IP
|
||||
traffic out the ipddp interface.
|
||||
|
||||
Common Uses of ipddp.c
|
||||
----------------------
|
||||
Of course AppleTalk-IP decapsulation and encapsulation, but specifically
|
||||
decapsulation is being used most for connecting LocalTalk networks to
|
||||
IP networks. It has also been used on EtherTalk networks to serve
|
||||
Macs that are only able to tunnel IP over EtherTalk.
|
||||
|
||||
Encapsulation has been used to allow a Linux box stuck on a LocalTalk
|
||||
network to use IP. It should work equally well if you are stuck on an
|
||||
EtherTalk only network.
|
||||
|
||||
Further Assistance
|
||||
-------------------
|
||||
You can contact me (Jay Schulist <jschlst@samba.org>) with any
|
||||
questions regarding decapsulation or encapsulation. Bradford W. Johnson
|
||||
<johns393@maroon.tc.umn.edu> originally wrote the ipddp.c driver for IP
|
||||
encapsulation in AppleTalk.
|
||||
158
Documentation/networking/iphase.txt
Normal file
|
|
@ -0,0 +1,158 @@
|
|||
|
||||
READ ME FIRST
|
||||
ATM (i)Chip IA Linux Driver Source
|
||||
--------------------------------------------------------------------------------
|
||||
Read This Before You Begin!
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
Description
|
||||
-----------
|
||||
|
||||
This is the README file for the Interphase PCI ATM (i)Chip IA Linux driver
|
||||
source release.
|
||||
|
||||
The features and limitations of this driver are as follows:
|
||||
- A single VPI (VPI value of 0) is supported.
|
||||
- Supports 4K VCs for the server board (with 512K control memory) and 1K
|
||||
VCs for the client board (with 128K control memory).
|
||||
- UBR, ABR and CBR service categories are supported.
|
||||
- Only AAL5 is supported.
|
||||
- Supports setting of PCR on the VCs.
|
||||
- Multiple adapters in a system are supported.
|
||||
- All variants of Interphase ATM PCI (i)Chip adapter cards are supported,
|
||||
including x575 (OC3, control memory 128K , 512K and packet memory 128K,
|
||||
512K and 1M), x525 (UTP25) and x531 (DS3 and E3). See
|
||||
http://www.iphase.com/
|
||||
for details.
|
||||
- Only x86 platforms are supported.
|
||||
- SMP is supported.
|
||||
|
||||
|
||||
Before You Start
|
||||
----------------
|
||||
|
||||
|
||||
Installation
|
||||
------------
|
||||
|
||||
1. Installing the adapters in the system
|
||||
To install the ATM adapters in the system, follow the steps below.
|
||||
a. Login as root.
|
||||
b. Shut down the system and power off the system.
|
||||
c. Install one or more ATM adapters in the system.
|
||||
d. Connect each adapter to a port on an ATM switch. The green 'Link'
|
||||
LED on the front panel of the adapter will be on if the adapter is
|
||||
connected to the switch properly when the system is powered up.
|
||||
e. Power on and boot the system.
|
||||
|
||||
2. [ Removed ]
|
||||
|
||||
3. Rebuild kernel with ABR support
|
||||
[ a. and b. removed ]
|
||||
c. Reconfigure the kernel, choose the Interphase ia driver through "make
|
||||
menuconfig" or "make xconfig".
|
||||
d. Rebuild the kernel, loadable modules and the atm tools.
|
||||
e. Install the newly built kernel and modules and reboot.
|
||||
|
||||
4. Load the adapter hardware driver (ia driver) if it is built as a module
|
||||
a. Login as root.
|
||||
b. Change directory to /lib/modules/<kernel-version>/atm.
|
||||
c. Run "insmod suni.o;insmod iphase.o"
|
||||
The yellow 'status' LED on the front panel of the adapter will blink
|
||||
while the driver is loaded in the system.
|
||||
d. To verify that the 'ia' driver is loaded successfully, run the
|
||||
following command:
|
||||
|
||||
cat /proc/atm/devices
|
||||
|
||||
If the driver is loaded successfully, the output of the command will
|
||||
be similar to the following lines:
|
||||
|
||||
Itf Type ESI/"MAC"addr AAL(TX,err,RX,err,drop) ...
|
||||
0 ia xxxxxxxxx 0 ( 0 0 0 0 0 ) 5 ( 0 0 0 0 0 )
|
||||
|
||||
You can also check the system log file /var/log/messages for messages
|
||||
related to the ATM driver.
|
||||
|
||||
5. Ia Driver Configuration
|
||||
|
||||
5.1 Configuration of adapter buffers
|
||||
The (i)Chip boards have 3 different packet RAM size variants: 128K, 512K and
|
||||
1M. The RAM size decides the number of buffers and buffer size. The default
|
||||
size and number of buffers are set as follows:
|
||||
|
||||
Total     Rx RAM   Tx RAM   Rx Buf   Tx Buf   Rx buf   Tx buf
RAM size  size     size     size     size     cnt      cnt
--------  ------   ------   ------   ------   ------   ------
128K      64K      64K      10K      10K      6        6
512K      256K     256K     10K      10K      25       25
1M        512K     512K     10K      10K      51       51
|
||||
|
||||
These settings should work well in most environments, but can be
|
||||
changed by typing the following command:
|
||||
|
||||
insmod <IA_DIR>/ia.o IA_RX_BUF=<RX_CNT> IA_RX_BUF_SZ=<RX_SIZE> \
|
||||
IA_TX_BUF=<TX_CNT> IA_TX_BUF_SZ=<TX_SIZE>
|
||||
Where:
|
||||
RX_CNT = number of receive buffers in the range (1-128)
|
||||
RX_SIZE = size of receive buffers in the range (48-64K)
|
||||
TX_CNT = number of transmit buffers in the range (1-128)
|
||||
TX_SIZE = size of transmit buffers in the range (48-64K)
|
||||
|
||||
1. Transmit and receive buffer size must be a multiple of 4.
|
||||
2. Care should be taken so that the memory required for the
|
||||
transmit and receive buffers is less than or equal to the
|
||||
total adapter packet memory.
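
The two rules above, together with the stated parameter ranges, can be
checked before loading the module. The sketch below is only an
illustration: the function name check_ia_buffers is hypothetical, sizes
are given in bytes, and K is assumed to mean 1024 (so 64K is 65536 and
10K is 10240).

```shell
# Check a buffer layout against the documented ranges and rules.
# Arguments: <rx_cnt> <rx_size> <tx_cnt> <tx_size> <packet_ram_bytes>
check_ia_buffers() {
    ia_rx_cnt=$1
    ia_rx_size=$2
    ia_tx_cnt=$3
    ia_tx_size=$4
    ia_ram=$5

    # Buffer counts must be in the range 1-128.
    for ia_cnt in "$ia_rx_cnt" "$ia_tx_cnt"; do
        if [ "$ia_cnt" -lt 1 ] || [ "$ia_cnt" -gt 128 ]; then
            echo "buffer count $ia_cnt is outside 1-128"
            return 1
        fi
    done

    # Buffer sizes must be in the range 48-64K and a multiple of 4.
    for ia_size in "$ia_rx_size" "$ia_tx_size"; do
        if [ "$ia_size" -lt 48 ] || [ "$ia_size" -gt 65536 ]; then
            echo "buffer size $ia_size is outside 48-64K"
            return 1
        fi
        if [ $(( ia_size % 4 )) -ne 0 ]; then
            echo "buffer size $ia_size is not a multiple of 4"
            return 1
        fi
    done

    # Rx and Tx buffers together must fit in the adapter packet memory.
    ia_needed=$(( ia_rx_cnt * ia_rx_size + ia_tx_cnt * ia_tx_size ))
    if [ "$ia_needed" -gt "$ia_ram" ]; then
        echo "buffers need $ia_needed bytes but the adapter has $ia_ram"
        return 1
    fi
    echo "ok: $ia_needed of $ia_ram bytes used"
}

# Default layout of the 128K board: 6 Rx and 6 Tx buffers of 10K each.
check_ia_buffers 6 10240 6 10240 131072
```

With the 128K defaults this uses 2 x 6 x 10240 = 122880 bytes, which fits
in the 131072 bytes available; one more buffer in each direction would
not.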
|
||||
|
||||
5.2 Turn on ia debug trace
|
||||
|
||||
When the ia driver is built with the CONFIG_ATM_IA_DEBUG flag, the driver
|
||||
can provide more debug trace if needed. There is a bit mask variable,
|
||||
IADebugFlag, which controls the output of the traces. You can find the bit
|
||||
map of the IADebugFlag in iphase.h.
|
||||
The debug trace can be turned on through the insmod command line option, for
|
||||
example, "insmod iphase.o IADebugFlag=0xffffffff" can turn on all the debug
|
||||
traces together with loading the driver.
|
||||
|
||||
6. Ia Driver Test Using ttcp_atm and PVC
|
||||
|
||||
For the PVC setup, the test machines can either be connected back-to-back or
|
||||
through a switch. If connected through the switch, the switch must be
|
||||
configured for the PVC(s).
|
||||
|
||||
a. For UBR test:
|
||||
At the test machine intended to receive data, type:
|
||||
ttcp_atm -r -a -s 0.100
|
||||
At the other test machine, type:
|
||||
ttcp_atm -t -a -s 0.100 -n 10000
|
||||
Run "ttcp_atm -h" to display more options of the ttcp_atm tool.
|
||||
b. For ABR test:
|
||||
It is the same as the UBR testing, but with an extra command option:
|
||||
-Pabr:max_pcr=<xxx>
|
||||
where:
|
||||
xxx = the maximum peak cell rate, from 170 - 353207.
|
||||
This option must be set on both the machines.
|
||||
c. For CBR test:
|
||||
It is the same as the UBR testing, but with an extra command option:
|
||||
-Pcbr:max_pcr=<xxx>
|
||||
where:
|
||||
xxx = the maximum peak cell rate, from 170 - 353207.
|
||||
This option may only be set on the transmit machine.
|
||||
|
||||
|
||||
OUTSTANDING ISSUES
|
||||
------------------
|
||||
|
||||
|
||||
|
||||
Contact Information
|
||||
-------------------
|
||||
|
||||
Customer Support:
|
||||
United States: Telephone: (214) 654-5555
|
||||
Fax: (214) 654-5500
|
||||
E-Mail: intouch@iphase.com
|
||||
Europe: Telephone: 33 (0)1 41 15 44 00
|
||||
Fax: 33 (0)1 41 15 12 13
|
||||
World Wide Web: http://www.iphase.com
|
||||
Anonymous FTP: ftp.iphase.com
|
||||
38
Documentation/networking/ipsec.txt
Normal file
|
|
@ -0,0 +1,38 @@
|
|||
|
||||
This file documents known IPsec corner cases that need to be kept in mind when
|
||||
deploying various IPsec configurations in real-world production environments.
|
||||
|
||||
1. IPcomp: Small IP packets won't get compressed at the sender, and fail the
|
||||
policy check on the receiver.
|
||||
|
||||
Quote from RFC3173:
|
||||
2.2. Non-Expansion Policy
|
||||
|
||||
If the total size of a compressed payload and the IPComp header, as
|
||||
defined in section 3, is not smaller than the size of the original
|
||||
payload, the IP datagram MUST be sent in the original non-compressed
|
||||
form. To clarify: If an IP datagram is sent non-compressed, no
IPComp header is added to the datagram. This policy ensures saving
|
||||
the decompression processing cycles and avoiding incurring IP
|
||||
datagram fragmentation when the expanded datagram is larger than the
|
||||
MTU.
|
||||
|
||||
Small IP datagrams are likely to expand as a result of compression.
|
||||
Therefore, a numeric threshold should be applied before compression,
|
||||
where IP datagrams of size smaller than the threshold are sent in the
|
||||
original form without attempting compression. The numeric threshold
|
||||
is implementation dependent.
|
||||
|
||||
The current IPComp implementation follows the RFC. In practice, however,
when a non-compressed packet is sent to the peer (either because the packet
length is smaller than the threshold or because the compressed length is
larger than the original packet length), the receiver drops the packet during
the policy check, because the packet matches the selector but does not come
from any XFRM layer, i.e., it has no security path. Such a naked packet never
makes it to the upper layer. The result looks even stranger to a user who
pings the peer with different payload lengths.
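
The sender-side decision described in the RFC text quoted above can be
sketched as follows. This is an illustration, not the kernel code: the
function name ipcomp_send_form is hypothetical, and the threshold of 90 bytes
used in the calls is only an assumed example value (the RFC leaves the
threshold to the implementation). The 4-byte IPComp header length comes from
RFC 3173.

```shell
# Sketch of the RFC 3173 non-expansion decision on the sender.
#   $1 = original payload length in bytes
#   $2 = payload length after compression in bytes
#   $3 = threshold below which compression is not attempted (assumed value)
IPCOMP_HDR_LEN=4   # IPComp header: next header, flags, 16-bit CPI

ipcomp_send_form() {
    orig_len=$1; comp_len=$2; threshold=$3

    if [ "$orig_len" -lt "$threshold" ]; then
        echo "original"        # too small, compression is not attempted
    elif [ $((comp_len + IPCOMP_HDR_LEN)) -ge "$orig_len" ]; then
        echo "original"        # compression did not make the packet smaller
    else
        echo "compressed"
    fi
}

ipcomp_send_form 64 60 90       # small ping payload
ipcomp_send_form 1400 700 90    # large, compressible payload
ipcomp_send_form 1400 1398 90   # large, incompressible payload
```

Every packet for which this prints "original" leaves the sender without an
IPComp header, which is exactly the case that fails the policy check on the
receiver.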
|
||||
|
||||
One workaround is to set "level use" on each policy if the scenario above is
observed. The consequence is that small (uncompressed) packets will skip the
policy check on the receiver side.
|
||||
72
Documentation/networking/ipv6.txt
Normal file
|
|
@ -0,0 +1,72 @@
|
|||
|
||||
Options for the ipv6 module are supplied as parameters at load time.
|
||||
|
||||
Module options may be given as command line arguments to the insmod
|
||||
or modprobe command, but are usually specified in either
|
||||
/etc/modules.d/*.conf configuration files, or in a distro-specific
|
||||
configuration file.
|
||||
|
||||
The available ipv6 module parameters are listed below. If a parameter
|
||||
is not specified the default value is used.
|
||||
|
||||
The parameters are as follows:
|
||||
|
||||
disable
|
||||
|
||||
Specifies whether to load the IPv6 module, but disable all
|
||||
its functionality. This might be used when another module
|
||||
has a dependency on the IPv6 module being loaded, but no
|
||||
IPv6 addresses or operations are desired.
|
||||
|
||||
The possible values and their effects are:
|
||||
|
||||
0
|
||||
IPv6 is enabled.
|
||||
|
||||
This is the default value.
|
||||
|
||||
1
|
||||
IPv6 is disabled.
|
||||
|
||||
No IPv6 addresses will be added to interfaces, and
|
||||
it will not be possible to open an IPv6 socket.
|
||||
|
||||
A reboot is required to enable IPv6.
|
||||
|
||||
autoconf
|
||||
|
||||
Specifies whether to enable IPv6 address autoconfiguration
|
||||
on all interfaces. This might be used when one does not wish
|
||||
for addresses to be automatically generated from prefixes
|
||||
received in Router Advertisements.
|
||||
|
||||
The possible values and their effects are:
|
||||
|
||||
0
|
||||
IPv6 address autoconfiguration is disabled on all interfaces.
|
||||
|
||||
Only the IPv6 loopback address (::1) and link-local addresses
|
||||
will be added to interfaces.
|
||||
|
||||
1
|
||||
IPv6 address autoconfiguration is enabled on all interfaces.
|
||||
|
||||
This is the default value.
|
||||
|
||||
disable_ipv6
|
||||
|
||||
Specifies whether to disable IPv6 on all interfaces.
|
||||
This might be used when no IPv6 addresses are desired.
|
||||
|
||||
The possible values and their effects are:
|
||||
|
||||
0
|
||||
IPv6 is enabled on all interfaces.
|
||||
|
||||
This is the default value.
|
||||
|
||||
1
|
||||
IPv6 is disabled on all interfaces.
|
||||
|
||||
No IPv6 addresses will be added to interfaces.
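
As a small sketch of how the three parameters combine, the helper below
prints an "options" line in the syntax used by modprobe configuration files.
The function name ipv6_options_line is hypothetical, and the file in which the
line should be placed is distribution specific and deliberately left out.

```shell
# Hypothetical helper: print an "options" line for the ipv6 module.
# Each argument must be 0 or 1; the order is disable, autoconf, disable_ipv6.
ipv6_options_line() {
    for v in "$1" "$2" "$3"; do
        case $v in
            0|1) ;;
            *)   return 1 ;;
        esac
    done
    echo "options ipv6 disable=$1 autoconf=$2 disable_ipv6=$3"
}

# Defaults as documented above: disable=0 autoconf=1 disable_ipv6=0
ipv6_options_line 0 1 0
# Load the module but keep it inert, e.g. to satisfy a module dependency
ipv6_options_line 1 1 0
```

The same name=value pairs can also be passed directly on the modprobe command
line, as described at the top of this file.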
|
||||
|
||||
211
Documentation/networking/ipvs-sysctl.txt
Normal file
|
|
@ -0,0 +1,211 @@
|
|||
/proc/sys/net/ipv4/vs/* Variables:
|
||||
|
||||
am_droprate - INTEGER
|
||||
default 10
|
||||
|
||||
It sets the always mode drop rate, which is used in mode 3
of the drop_packet defense.
|
||||
|
||||
amemthresh - INTEGER
|
||||
default 1024
|
||||
|
||||
It sets the available memory threshold (in pages), which is
used in the automatic modes of defense. When there is not
enough available memory, the respective strategy will be
enabled and the variable is automatically set to 2, otherwise
the strategy is disabled and the variable is set to 1.
|
||||
|
||||
backup_only - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
|
||||
If set, disable the director function while the server is
|
||||
in backup mode to avoid packet loops for DR/TUN methods.
|
||||
|
||||
conntrack - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
|
||||
If set, maintain connection tracking entries for
|
||||
connections handled by IPVS.
|
||||
|
||||
This should be enabled if connections handled by IPVS are to be
|
||||
also handled by stateful firewall rules. That is, iptables rules
|
||||
that make use of connection tracking. It is a performance
|
||||
optimisation to disable this setting otherwise.
|
||||
|
||||
Connections handled by the IPVS FTP application module
|
||||
will have connection tracking entries regardless of this setting.
|
||||
|
||||
Only available when IPVS is compiled with CONFIG_IP_VS_NFCT enabled.
|
||||
|
||||
cache_bypass - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
|
||||
If it is enabled, forward packets to the original destination
|
||||
directly when no cache server is available and destination
|
||||
address is not local (iph->daddr is RTN_UNICAST). It is mostly
|
||||
used in transparent web cache cluster.
|
||||
|
||||
debug_level - INTEGER
|
||||
0 - transmission error messages (default)
|
||||
1 - non-fatal error messages
|
||||
2 - configuration
|
||||
3 - destination trash
|
||||
4 - drop entry
|
||||
5 - service lookup
|
||||
6 - scheduling
|
||||
7 - connection new/expire, lookup and synchronization
|
||||
8 - state transition
|
||||
9 - binding destination, template checks and applications
|
||||
10 - IPVS packet transmission
|
||||
11 - IPVS packet handling (ip_vs_in/ip_vs_out)
|
||||
12 or more - packet traversal
|
||||
|
||||
Only available when IPVS is compiled with CONFIG_IP_VS_DEBUG enabled.
|
||||
|
||||
Higher debugging levels include the messages for lower debugging
levels, so setting debug level 2 includes level 0, 1 and 2
messages. Thus, logging becomes more and more verbose the higher
the level.
|
||||
|
||||
drop_entry - INTEGER
|
||||
0 - disabled (default)
|
||||
|
||||
The drop_entry defense randomly drops entries in the
connection hash table in order to reclaim some memory for
new connections. In the current code, the drop_entry
procedure can be activated every second; it then randomly
scans 1/32 of the whole table and drops entries that are in
the SYN-RECV/SYNACK state, which should be effective against
SYN-flooding attacks.
|
||||
|
||||
The valid values of drop_entry are from 0 to 3, where 0 means
that this strategy is always disabled, 1 and 2 mean automatic
modes (when there is not enough available memory, the strategy
is enabled and the variable is automatically set to 2,
otherwise the strategy is disabled and the variable is set to
1), and 3 means that the strategy is always enabled.
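
The way the variable moves between the values 1 and 2 can be sketched as
below. This only restates the description above; the function name
ipvs_next_mode is hypothetical, and "not enough available memory" is taken to
mean that available memory is below amemthresh.

```shell
# Sketch of how a defense variable changes with memory pressure.
#   $1 = current value of the variable (0..3)
#   $2 = available memory in pages
#   $3 = amemthresh in pages
ipvs_next_mode() {
    mode=$1; avail=$2; amemthresh=$3

    case $mode in
        0|3) echo "$mode" ;;          # fixed: always disabled / always enabled
        1|2) if [ "$avail" -lt "$amemthresh" ]; then
                 echo 2               # not enough memory: strategy enabled
             else
                 echo 1               # enough memory: strategy disabled
             fi ;;
        *)   return 1 ;;
    esac
}

ipvs_next_mode 1 512 1024    # memory is short, automatic mode switches on
ipvs_next_mode 2 4096 1024   # memory has recovered, automatic mode switches off
ipvs_next_mode 3 4096 1024   # always enabled, memory is ignored
```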
|
||||
|
||||
drop_packet - INTEGER
|
||||
0 - disabled (default)
|
||||
|
||||
The drop_packet defense is designed to drop 1/rate packets
before forwarding them to real servers. If the rate is 1, then
all incoming packets are dropped.
|
||||
|
||||
The value definition is the same as that of drop_entry. In
the automatic mode, the rate is determined by the following
formula: rate = amemthresh / (amemthresh - available_memory)
when available memory is less than the available memory
threshold. When mode 3 is set, the always mode drop rate
is controlled by /proc/sys/net/ipv4/vs/am_droprate.
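
A worked example of the automatic-mode formula, as a sketch: the function
name ipvs_drop_rate is hypothetical, integer division is assumed, and printing
0 when memory is not below the threshold is only a convention of this example
to signal that the defense is inactive.

```shell
# Sketch of the automatic-mode drop_packet rate.
#   $1 = amemthresh in pages
#   $2 = currently available memory in pages
# Prints the rate; one packet in every <rate> packets is dropped.
ipvs_drop_rate() {
    amemthresh=$1; avail=$2

    if [ "$avail" -ge "$amemthresh" ]; then
        echo 0                                   # defense inactive
        return
    fi
    echo $((amemthresh / (amemthresh - avail)))  # integer division
}

ipvs_drop_rate 1024 768   # 1024 / 256
ipvs_drop_rate 1024 512   # 1024 / 512
ipvs_drop_rate 1024 0     # 1024 / 1024, every packet is dropped
```

The less memory is available, the closer the rate gets to 1 and the larger
the share of dropped packets.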
|
||||
|
||||
expire_nodest_conn - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
|
||||
With the default value of 0, the load balancer silently drops
packets when their destination server is not available. This
may be useful when a user-space monitoring program deletes the
destination server (because of server overload or a wrong
detection) and adds the server back later, so that connections
to the server can continue.
|
||||
|
||||
If this feature is enabled, the load balancer will expire the
connection immediately when a packet arrives and its
destination server is not available, and the client program
will then be notified that the connection is closed. This is
equivalent to the feature some people require to flush
connections when their destination is not available.
|
||||
|
||||
expire_quiescent_template - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
|
||||
When set to a non-zero value, the load balancer will expire
|
||||
persistent templates when the destination server is quiescent.
|
||||
This may be useful, when a user makes a destination server
|
||||
quiescent by setting its weight to 0 and it is desired that
|
||||
subsequent otherwise persistent connections are sent to a
|
||||
different destination server. By default new persistent
|
||||
connections are allowed to quiescent destination servers.
|
||||
|
||||
If this feature is enabled, the load balancer will expire the
|
||||
persistence template if it is to be used to schedule a new
|
||||
connection and the destination server is quiescent.
|
||||
|
||||
nat_icmp_send - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
|
||||
It controls sending ICMP error messages (ICMP_DEST_UNREACH)
|
||||
for VS/NAT when the load balancer receives packets from real
|
||||
servers but the connection entries don't exist.
|
||||
|
||||
secure_tcp - INTEGER
|
||||
0 - disabled (default)
|
||||
|
||||
The secure_tcp defense is to use a more complicated TCP state
|
||||
transition table. For VS/NAT, it also delays entering the
|
||||
TCP ESTABLISHED state until the three way handshake is completed.
|
||||
|
||||
The value definition is the same as that of drop_entry and
|
||||
drop_packet.
|
||||
|
||||
sync_threshold - INTEGER
|
||||
default 3
|
||||
|
||||
It sets the synchronization threshold, which is the minimum number
of incoming packets that a connection needs to receive before
the connection will be synchronized. A connection will be
synchronized every time the number of its incoming packets
modulo 50 equals the threshold. The range of the threshold is
from 0 to 49.
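
The rule above can be checked with a few packet counts. The sketch below
assumes nothing beyond the description; the function name ipvs_sync_due is
hypothetical.

```shell
# Succeeds (exit status 0) when a connection would be synchronized.
#   $1 = number of incoming packets seen on the connection
#   $2 = sync_threshold (0..49)
ipvs_sync_due() {
    [ $(( $1 % 50 )) -eq "$2" ]
}

# With the default threshold of 3, try a few packet counts.
for pkts in 3 4 53 103; do
    if ipvs_sync_due "$pkts" 3; then
        echo "packet $pkts: sync"
    else
        echo "packet $pkts: no sync"
    fi
done
```

With the default threshold a connection is therefore synchronized at its
3rd, 53rd, 103rd, ... incoming packet.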
|
||||
|
||||
snat_reroute - BOOLEAN
|
||||
0 - disabled
|
||||
not 0 - enabled (default)
|
||||
|
||||
If enabled, recalculate the route of SNATed packets from
|
||||
realservers so that they are routed as if they originate from the
|
||||
director. Otherwise they are routed as if they are forwarded by the
|
||||
director.
|
||||
|
||||
If policy routing is in effect then it is possible that a packet
originating from a director is routed differently from a packet
being forwarded by the director.
|
||||
|
||||
If policy routing is not in effect then the recalculated route will
|
||||
always be the same as the original route so it is an optimisation
|
||||
to disable snat_reroute and avoid the recalculation.
|
||||
|
||||
sync_persist_mode - INTEGER
|
||||
default 0
|
||||
|
||||
Controls the synchronisation of connections when using persistence
|
||||
|
||||
0: All types of connections are synchronised
|
||||
1: Attempt to reduce the synchronisation traffic depending on
|
||||
the connection type. For persistent services avoid synchronisation
|
||||
for normal connections, do it only for persistence templates.
|
||||
In such case, for TCP and SCTP it may need enabling sloppy_tcp and
|
||||
sloppy_sctp flags on backup servers. For non-persistent services
|
||||
such optimization is not applied, mode 0 is assumed.
|
||||
|
||||
sync_version - INTEGER
|
||||
default 1
|
||||
|
||||
The version of the synchronisation protocol used when sending
|
||||
synchronisation messages.
|
||||
|
||||
0 selects the original synchronisation protocol (version 0). This
|
||||
should be used when sending synchronisation messages to a legacy
|
||||
system that only understands the original synchronisation protocol.
|
||||
|
||||
1 selects the current synchronisation protocol (version 1). This
|
||||
should be used where possible.
|
||||
|
||||
Kernels with this sync_version entry are able to receive messages
of both version 0 and version 1 of the synchronisation protocol.
|
||||
10
Documentation/networking/irda.txt
Normal file
|
|
@ -0,0 +1,10 @@
|
|||
To use the IrDA protocols within Linux you will need to get a suitable copy
|
||||
of the IrDA Utilities. More detailed information about these and associated
|
||||
programs can be found on http://irda.sourceforge.net/
|
||||
|
||||
For more information about how to use the IrDA protocol stack, see the
|
||||
Linux Infrared HOWTO by Werner Heuser <wehe@tuxmobil.org>:
|
||||
<http://www.tuxmobil.org/Infrared-HOWTO/Infrared-HOWTO.html>
|
||||
|
||||
There is an active mailing list for discussing Linux-IrDA matters called
|
||||
irda-users@lists.sourceforge.net
|
||||
433
Documentation/networking/ixgb.txt
Normal file
|
|
@ -0,0 +1,433 @@
|
|||
Linux Base Driver for 10 Gigabit Intel(R) Ethernet Network Connection
|
||||
=====================================================================
|
||||
|
||||
March 14, 2011
|
||||
|
||||
|
||||
Contents
|
||||
========
|
||||
|
||||
- In This Release
|
||||
- Identifying Your Adapter
|
||||
- Building and Installation
|
||||
- Command Line Parameters
|
||||
- Improving Performance
|
||||
- Additional Configurations
|
||||
- Known Issues/Troubleshooting
|
||||
- Support
|
||||
|
||||
|
||||
|
||||
In This Release
|
||||
===============
|
||||
|
||||
This file describes the ixgb Linux Base Driver for the 10 Gigabit Intel(R)
|
||||
Network Connection. This driver includes support for Itanium(R)2-based
|
||||
systems.
|
||||
|
||||
For questions related to hardware requirements, refer to the documentation
|
||||
supplied with your 10 Gigabit adapter. All hardware requirements listed apply
|
||||
to use with Linux.
|
||||
|
||||
The following features are available in this kernel:
|
||||
- Native VLANs
|
||||
- Channel Bonding (teaming)
|
||||
- SNMP
|
||||
|
||||
Channel Bonding documentation can be found in the Linux kernel source:
|
||||
/Documentation/networking/bonding.txt
|
||||
|
||||
The driver information previously displayed in the /proc filesystem is not
|
||||
supported in this release. Alternatively, you can use ethtool (version 1.6
|
||||
or later), lspci, and ifconfig to obtain the same information.
|
||||
|
||||
Instructions on updating ethtool can be found in the section "Additional
|
||||
Configurations" later in this document.
|
||||
|
||||
|
||||
Identifying Your Adapter
|
||||
========================
|
||||
|
||||
The following Intel network adapters are compatible with the drivers in this
|
||||
release:
|
||||
|
||||
Controller  Adapter Name                  Physical Layer
----------  ------------                  --------------
82597EX     Intel(R) PRO/10GbE LR/SR/CX4  10G Base-LR (1310 nm optical fiber)
            Server Adapters               10G Base-SR (850 nm optical fiber)
                                          10G Base-CX4 (twin-axial copper cabling)
|
||||
|
||||
For more information on how to identify your adapter, go to the Adapter &
|
||||
Driver ID Guide at:
|
||||
|
||||
http://support.intel.com/support/network/sb/CS-012904.htm
|
||||
|
||||
|
||||
Building and Installation
|
||||
=========================
|
||||
|
||||
Select m for "Intel(R) PRO/10GbE support" located at:
      Location:
        -> Device Drivers
          -> Network device support (NETDEVICES [=y])
            -> Ethernet (10000 Mbit) (NETDEV_10000 [=y])

1. make modules && make modules_install
|
||||
|
||||
2. Load the module:
|
||||
|
||||
modprobe ixgb <parameter>=<value>
|
||||
|
||||
The insmod command can be used if the full
|
||||
path to the driver module is specified. For example:
|
||||
|
||||
insmod /lib/modules/<KERNEL VERSION>/kernel/drivers/net/ixgb/ixgb.ko
|
||||
|
||||
With 2.6 based kernels also make sure that older ixgb drivers are
|
||||
removed from the kernel, before loading the new module:
|
||||
|
||||
rmmod ixgb; modprobe ixgb
|
||||
|
||||
3. Assign an IP address to the interface by entering the following, where
|
||||
x is the interface number:
|
||||
|
||||
ifconfig ethx <IP_address>
|
||||
|
||||
4. Verify that the interface works. Enter the following, where <IP_address>
|
||||
is the IP address for another machine on the same subnet as the interface
|
||||
that is being tested:
|
||||
|
||||
ping <IP_address>
|
||||
|
||||
|
||||
Command Line Parameters
|
||||
=======================
|
||||
|
||||
If the driver is built as a module, the following optional parameters are
|
||||
used by entering them on the command line with the modprobe command using
|
||||
this syntax:
|
||||
|
||||
modprobe ixgb [<option>=<VAL1>,<VAL2>,...]
|
||||
|
||||
For example, with two 10GbE PCI adapters, entering:
|
||||
|
||||
modprobe ixgb TxDescriptors=80,128
|
||||
|
||||
loads the ixgb driver with 80 TX resources for the first adapter and 128 TX
|
||||
resources for the second adapter.
|
||||
|
||||
The default value for each parameter is generally the recommended setting,
|
||||
unless otherwise noted.
|
||||
|
||||
FlowControl
|
||||
Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx)
|
||||
Default: Read from the EEPROM
|
||||
If EEPROM is not detected, default is 1
|
||||
This parameter controls the automatic generation (Tx) of and response (Rx) to
Ethernet PAUSE frames. There are hardware bugs associated with enabling
Tx flow control, so beware.
|
||||
|
||||
RxDescriptors
|
||||
Valid Range: 64-512
|
||||
Default Value: 512
|
||||
This value is the number of receive descriptors allocated by the driver.
|
||||
Increasing this value allows the driver to buffer more incoming packets.
|
||||
Each descriptor is 16 bytes. A receive buffer is also allocated for
each descriptor and can be either 2048, 4096, 8192, or 16384 bytes,
depending on the MTU setting. When the MTU size is 1500 or less, the
receive buffer size is 2048 bytes. When the MTU is greater than 1500 the
receive buffer size will be either 4096, 8192, or 16384 bytes. The
maximum MTU size is 16114.
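
To get a feel for the memory cost of the receive ring, the figures above can
be multiplied out. This is only an estimate under the stated sizes (16 bytes
per descriptor plus one receive buffer per descriptor); it ignores any
allocator overhead, and the function name ixgb_rx_ring_bytes is hypothetical.

```shell
# Estimate the memory used by the receive ring of one adapter.
#   $1 = RxDescriptors (valid range 64..512)
#   $2 = receive buffer size in bytes
ixgb_rx_ring_bytes() {
    descs=$1; buf=$2

    if [ "$descs" -lt 64 ] || [ "$descs" -gt 512 ]; then
        return 1                       # outside the documented valid range
    fi
    echo $((descs * (16 + buf)))       # descriptor plus its buffer
}

ixgb_rx_ring_bytes 512 2048     # default ring with an MTU of 1500 or less
ixgb_rx_ring_bytes 512 16384    # default ring with the largest buffer size
```

With the default of 512 descriptors this comes to roughly 1 MB per adapter at
the standard MTU and roughly 8 MB with the largest buffers.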
|
||||
|
||||
RxIntDelay
|
||||
Valid Range: 0-65535 (0=off)
|
||||
Default Value: 72
|
||||
This value delays the generation of receive interrupts in units of
|
||||
0.8192 microseconds. Receive interrupt reduction can improve CPU
|
||||
efficiency if properly tuned for specific network traffic. Increasing
|
||||
this value adds extra latency to frame reception and can end up
|
||||
decreasing the throughput of TCP traffic. If the system is reporting
|
||||
dropped receives, this value may be set too high, causing the driver to
|
||||
run out of available receive descriptors.
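
Because the unit is not a round number, it can help to convert a setting
into microseconds before changing it. A minimal sketch, using integer
arithmetic only; the function name ixgb_rx_delay_us is hypothetical.

```shell
# Convert an RxIntDelay setting into microseconds.
#   $1 = RxIntDelay value (valid range 0..65535)
ixgb_rx_delay_us() {
    ticks=$1

    if [ "$ticks" -lt 0 ] || [ "$ticks" -gt 65535 ]; then
        return 1
    fi
    # One unit is 0.8192 us; count in steps of 0.0001 us to stay in integers.
    t=$((ticks * 8192))
    printf '%d.%04d\n' $((t / 10000)) $((t % 10000))
}

ixgb_rx_delay_us 72       # the default value
ixgb_rx_delay_us 65535    # the largest valid value
```

The default of 72 therefore corresponds to a delay of about 59 microseconds,
and the maximum to about 53.7 milliseconds.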
|
||||
|
||||
TxDescriptors
|
||||
Valid Range: 64-4096
|
||||
Default Value: 256
|
||||
This value is the number of transmit descriptors allocated by the driver.
|
||||
Increasing this value allows the driver to queue more transmits. Each
|
||||
descriptor is 16 bytes.
|
||||
|
||||
XsumRX
|
||||
Valid Range: 0-1
|
||||
Default Value: 1
|
||||
A value of '1' indicates that the driver should enable IP checksum
|
||||
offload for received packets (both UDP and TCP) to the adapter hardware.
|
||||
|
||||
|
||||
Improving Performance
|
||||
=====================
|
||||
|
||||
With the 10 Gigabit server adapters, the default Linux configuration will
|
||||
very likely limit the total available throughput artificially. There is a set
|
||||
of configuration changes that, when applied together, will increase the ability
|
||||
of Linux to transmit and receive data. The following enhancements were
|
||||
originally acquired from settings published at http://www.spec.org/web99/ for
|
||||
various submitted results using Linux.
|
||||
|
||||
NOTE: These changes are only suggestions, and serve as a starting point for
|
||||
tuning your network performance.
|
||||
|
||||
The changes are made in three major ways, listed in order of greatest effect:
|
||||
- Use ifconfig to modify the mtu (maximum transmission unit) and the txqueuelen
|
||||
parameter.
|
||||
- Use sysctl to modify /proc parameters (essentially kernel tuning)
|
||||
- Use setpci to modify the MMRBC field in PCI-X configuration space to increase
|
||||
transmit burst lengths on the bus.
|
||||
|
||||
NOTE: setpci modifies the adapter's configuration registers to allow it to read
|
||||
up to 4k bytes at a time (for transmits). However, for some systems the
|
||||
behavior after modifying this register may be undefined (possibly errors of
|
||||
some kind). A power-cycle, hard reset or explicitly setting the e6 register
|
||||
back to 22 (setpci -d 8086:1a48 e6.b=22) may be required to get back to a
|
||||
stable configuration.
|
||||
|
||||
- COPY these lines and paste them into ixgb_perf.sh:
|
||||
#!/bin/bash
|
||||
echo "configuring network performance , edit this file to change the interface
|
||||
or device ID of 10GbE card"
|
||||
# set mmrbc to 4k reads, modify only Intel 10GbE device IDs
|
||||
# replace 1a48 with appropriate 10GbE device's ID installed on the system,
|
||||
# if needed.
|
||||
setpci -d 8086:1a48 e6.b=2e
|
||||
# set the MTU (max transmission unit) - it requires your switch and clients
|
||||
# to change as well.
|
||||
# set the txqueuelen
|
||||
# your ixgb adapter should be loaded as eth1 for this to work, change if needed
|
||||
ifconfig eth1 mtu 9000 txqueuelen 1000 up
|
||||
# call the sysctl utility to modify /proc/sys entries
|
||||
sysctl -p ./sysctl_ixgb.conf
|
||||
- END ixgb_perf.sh
|
||||
|
||||
- COPY these lines and paste them into sysctl_ixgb.conf:
|
||||
# some of the defaults may be different for your kernel
|
||||
# call this file with sysctl -p <this file>
|
||||
# these are just suggested values that worked well to increase throughput in
|
||||
# several network benchmark tests, your mileage may vary
|
||||
|
||||
### IPV4 specific settings
|
||||
# turn TCP timestamp support off, default 1, reduces CPU use
|
||||
net.ipv4.tcp_timestamps = 0
|
||||
# turn SACK support off, default on
|
||||
# on systems with a VERY fast bus -> memory interface this is the big gainer
|
||||
net.ipv4.tcp_sack = 0
|
||||
# set min/default/max TCP read buffer, default 4096 87380 174760
|
||||
net.ipv4.tcp_rmem = 10000000 10000000 10000000
|
||||
# set min/default/max TCP write buffer, default 4096 16384 131072
|
||||
net.ipv4.tcp_wmem = 10000000 10000000 10000000
|
||||
# set min/pressure/max TCP buffer space, default 31744 32256 32768
|
||||
net.ipv4.tcp_mem = 10000000 10000000 10000000
|
||||
|
||||
### CORE settings (mostly for socket and UDP effect)
|
||||
# set maximum receive socket buffer size, default 131071
|
||||
net.core.rmem_max = 524287
|
||||
# set maximum send socket buffer size, default 131071
|
||||
net.core.wmem_max = 524287
|
||||
# set default receive socket buffer size, default 65535
|
||||
net.core.rmem_default = 524287
|
||||
# set default send socket buffer size, default 65535
|
||||
net.core.wmem_default = 524287
|
||||
# set maximum amount of option memory buffers, default 10240
|
||||
net.core.optmem_max = 524287
|
||||
# set number of unprocessed input packets before kernel starts dropping them; default 300
|
||||
net.core.netdev_max_backlog = 300000
|
||||
- END sysctl_ixgb.conf
|
||||
|
||||
Edit the ixgb_perf.sh script if necessary to change eth1 to whatever interface
|
||||
your ixgb driver is using and/or replace '1a48' with appropriate 10GbE device's
|
||||
ID installed on the system.
|
||||
|
||||
NOTE: Unless these scripts are added to the boot process, these changes will
last only until the next system reboot.
|
||||
|
||||
|
||||
Resolving Slow UDP Traffic
|
||||
--------------------------
|
||||
If your server does not seem to be able to receive UDP traffic as fast as it
|
||||
can receive TCP traffic, it could be because Linux, by default, does not set
|
||||
the network stack buffers as large as they need to be to support high UDP
|
||||
transfer rates. One way to alleviate this problem is to allow more memory to
|
||||
be used by the IP stack to store incoming data.
|
||||
|
||||
For instance, use the commands:
|
||||
sysctl -w net.core.rmem_max=262143
|
||||
and
|
||||
sysctl -w net.core.rmem_default=262143
|
||||
to increase the read buffer memory max and default to 262143 (256k - 1) from
|
||||
defaults of max=131071 (128k - 1) and default=65535 (64k - 1). These variables
|
||||
will increase the amount of memory used by the network stack for receives, and
|
||||
can be increased significantly more if necessary for your application.
|
||||
|
||||
|
||||
Additional Configurations
|
||||
=========================
|
||||
|
||||
Configuring the Driver on Different Distributions
|
||||
-------------------------------------------------
|
||||
Configuring a network driver to load properly when the system is started is
|
||||
distribution dependent. Typically, the configuration process involves adding
|
||||
an alias line to /etc/modprobe.conf as well as editing other system startup
|
||||
scripts and/or configuration files. Many popular Linux distributions ship
|
||||
with tools to make these changes for you. To learn the proper way to
|
||||
configure a network device for your system, refer to your distribution
|
||||
documentation. If during this process you are asked for the driver or module
|
||||
name, the name for the Linux Base Driver for the Intel 10GbE Family of
|
||||
Adapters is ixgb.
|
||||
|
||||
Viewing Link Messages
|
||||
---------------------
|
||||
Link messages will not be displayed to the console if the distribution is
|
||||
restricting system messages. In order to see network driver link messages on
|
||||
your console, set dmesg to eight by entering the following:
|
||||
|
||||
dmesg -n 8
|
||||
|
||||
NOTE: This setting is not saved across reboots.
|
||||
|
||||
|
||||
Jumbo Frames
|
||||
------------
|
||||
The driver supports Jumbo Frames for all adapters. Jumbo Frames support is
|
||||
enabled by changing the MTU to a value larger than the default of 1500.
|
||||
The maximum value for the MTU is 16114. Use the ifconfig command to
|
||||
increase the MTU size. For example:
|
||||
|
||||
ifconfig ethx mtu 9000 up
|
||||
|
||||
The maximum MTU setting for Jumbo Frames is 16114. This value coincides
|
||||
with the maximum Jumbo Frames size of 16128.
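
The two numbers differ by 14 bytes, which matches the size of an Ethernet
header (two 6-byte addresses and a 2-byte type field). Assuming that this is
where the difference comes from, the relation can be sketched as follows; the
function name ixgb_frame_size is hypothetical.

```shell
ETH_HDR_LEN=14       # destination MAC (6) + source MAC (6) + EtherType (2)
IXGB_MAX_MTU=16114   # maximum MTU documented above

# Print the frame size that results from a given MTU.
#   $1 = requested MTU
ixgb_frame_size() {
    mtu=$1

    if [ "$mtu" -gt "$IXGB_MAX_MTU" ]; then
        return 1                     # larger than the driver accepts
    fi
    echo $((mtu + ETH_HDR_LEN))
}

ixgb_frame_size 9000     # the MTU used in the ifconfig example
ixgb_frame_size 16114    # the maximum MTU
```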
|
||||
|
||||
|
||||
ethtool
|
||||
-------
|
||||
The driver utilizes the ethtool interface for driver configuration and
|
||||
diagnostics, as well as displaying statistical information. The ethtool
|
||||
version 1.6 or later is required for this functionality.
|
||||
|
||||
The latest release of ethtool can be found from
|
||||
http://ftp.kernel.org/pub/software/network/ethtool/
|
||||
|
||||
NOTE: The ethtool version 1.6 only supports a limited set of ethtool options.
|
||||
Support for a more complete ethtool feature set can be enabled by
|
||||
upgrading to the latest version.
|
||||
|
||||
|
||||
NAPI
|
||||
----
|
||||
|
||||
NAPI (Rx polling mode) is supported in the ixgb driver. NAPI is enabled
or disabled based on the configuration of the kernel; see CONFIG_IXGB_NAPI.
|
||||
|
||||
See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI.
|
||||
|
||||
|
||||
Known Issues/Troubleshooting
|
||||
============================
|
||||
|
||||
NOTE: After installing the driver, if your Intel Network Connection is not
|
||||
working, verify in the "In This Release" section of the readme that you have
|
||||
installed the correct driver.
|
||||
|
||||
Intel(R) PRO/10GbE CX4 Server Adapter Cable Interoperability Issue with
|
||||
Fujitsu XENPAK Module in SmartBits Chassis
|
||||
---------------------------------------------------------------------
|
||||
Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4
|
||||
Server adapter is connected to a Fujitsu XENPAK CX4 module in a SmartBits
|
||||
chassis using 15 m/24AWG cable assemblies manufactured by Fujitsu or Leoni.
|
||||
The CRC errors may be received either by the Intel(R) PRO/10GbE CX4
|
||||
Server adapter or the SmartBits. If this situation occurs, using a different
cable assembly may resolve the issue.
|
||||
|
||||
CX4 Server Adapter Cable Interoperability Issues with HP Procurve 3400cl
|
||||
Switch Port
|
||||
------------------------------------------------------------------------
|
||||
Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4 Server
|
||||
adapter is connected to an HP Procurve 3400cl switch port using short cables
|
||||
(1 m or shorter). If this situation occurs, using a longer cable may resolve
|
||||
the issue.
|
||||
|
||||
Excessive CRC errors may be observed when using Fujitsu 24AWG cable assemblies
that are 10 m or longer or when using a Leoni 15 m/24AWG cable assembly. The CRC
errors may be received either by the CX4 Server adapter or at the switch. If
this situation occurs, using a different cable assembly may resolve the issue.
|
||||
|
||||
|
||||
Jumbo Frames System Requirement
|
||||
-------------------------------
|
||||
Memory allocation failures have been observed on Linux systems with 64 MB
|
||||
of RAM or less that are running Jumbo Frames. If you are using Jumbo
|
||||
Frames, your system may require more than the advertised minimum
|
||||
requirement of 64 MB of system memory.
|
||||
|
||||
|
||||
Performance Degradation with Jumbo Frames
|
||||
-----------------------------------------
|
||||
Degradation in throughput performance may be observed in some Jumbo frames
|
||||
environments. If this is observed, increasing the application's socket buffer
|
||||
size and/or increasing the /proc/sys/net/ipv4/tcp_*mem entry values may help.
|
||||
See the specific application manual and /usr/src/linux*/Documentation/
|
||||
networking/ip-sysctl.txt for more details.
|
||||
|
||||
|
||||
Allocating Rx Buffers when Using Jumbo Frames
|
||||
---------------------------------------------
|
||||
Allocating Rx buffers when using Jumbo Frames on 2.6.x kernels may fail if
|
||||
the available memory is heavily fragmented. This issue may be seen with PCI-X
|
||||
adapters or with packet split disabled. This can be reduced or eliminated
|
||||
by changing the amount of available memory for receive buffer allocation, by
|
||||
increasing /proc/sys/vm/min_free_kbytes.
|
||||
|
||||
|
||||
Multiple Interfaces on Same Ethernet Broadcast Network
|
||||
------------------------------------------------------
|
||||
Due to the default ARP behavior on Linux, it is not possible to have
|
||||
one system on two IP networks in the same Ethernet broadcast domain
|
||||
(non-partitioned switch) behave as expected. All Ethernet interfaces
|
||||
will respond to IP traffic for any IP address assigned to the system.
|
||||
This results in unbalanced receive traffic.
|
||||
|
||||
If you have multiple interfaces in a server, do either of the following:
|
||||
|
||||
- Turn on ARP filtering by entering:
|
||||
echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
|
||||
|
||||
- Install the interfaces in separate broadcast domains - either in
|
||||
different switches or in a switch partitioned to VLANs.
|
||||
|
||||
|
||||
UDP Stress Test Dropped Packet Issue
|
||||
--------------------------------------
|
||||
Under a small-packet UDP stress test with the 10GbE driver, the Linux system
may drop UDP packets because the socket buffers are full. Setting the
driver's Flow Control variables to the minimum value for controlling packet
reception may help.
|
||||
|
||||
|
||||
Tx Hangs Possible Under Stress
|
||||
------------------------------
|
||||
Under stress conditions, if Tx hangs occur, turning off TSO with
"ethtool -K eth0 tso off" may resolve the problem.
|
||||
|
||||
|
||||
Support
|
||||
=======
|
||||
|
||||
For general information, go to the Intel support website at:
|
||||
|
||||
http://support.intel.com
|
||||
|
||||
or the Intel Wired Networking project hosted by Sourceforge at:
|
||||
|
||||
http://sourceforge.net/projects/e1000
|
||||
|
||||
If an issue is identified with the released source code on the supported
|
||||
kernel with a supported adapter, email the specific information related
|
||||
to the issue to e1000-devel@lists.sf.net
|
||||
349
Documentation/networking/ixgbe.txt
Normal file
Linux* Base Driver for the Intel(R) Ethernet 10 Gigabit PCI Express Family of
|
||||
Adapters
|
||||
=============================================================================
|
||||
|
||||
Intel 10 Gigabit Linux driver.
|
||||
Copyright(c) 1999 - 2013 Intel Corporation.
|
||||
|
||||
Contents
|
||||
========
|
||||
|
||||
- Identifying Your Adapter
|
||||
- Additional Configurations
|
||||
- Performance Tuning
|
||||
- Known Issues
|
||||
- Support
|
||||
|
||||
Identifying Your Adapter
|
||||
========================
|
||||
|
||||
The driver in this release is compatible with 82598, 82599 and X540-based
|
||||
Intel Network Connections.
|
||||
|
||||
For more information on how to identify your adapter, go to the Adapter &
|
||||
Driver ID Guide at:
|
||||
|
||||
http://support.intel.com/support/network/sb/CS-012904.htm
|
||||
|
||||
SFP+ Devices with Pluggable Optics
|
||||
----------------------------------
|
||||
|
||||
82599-BASED ADAPTERS
|
||||
|
||||
NOTES: If your 82599-based Intel(R) Network Adapter came with Intel optics, or
|
||||
is an Intel(R) Ethernet Server Adapter X520-2, then it only supports Intel
|
||||
optics and/or the direct attach cables listed below.
|
||||
|
||||
When 82599-based SFP+ devices are connected back to back, they should be set to
|
||||
the same Speed setting via ethtool. Results may vary if you mix speed settings.
|
||||
82598-based adapters support all passive direct attach cables that comply
|
||||
with SFF-8431 v4.1 and SFF-8472 v10.4 specifications. Active direct attach
|
||||
cables are not supported.
|
||||
|
||||
Supplier    Type                                    Part Numbers

SR Modules
Intel       DUAL RATE 1G/10G SFP+ SR (bailed)       FTLX8571D3BCV-IT
Intel       DUAL RATE 1G/10G SFP+ SR (bailed)       AFBR-703SDDZ-IN1
Intel       DUAL RATE 1G/10G SFP+ SR (bailed)       AFBR-703SDZ-IN2
LR Modules
Intel       DUAL RATE 1G/10G SFP+ LR (bailed)       FTLX1471D3BCV-IT
Intel       DUAL RATE 1G/10G SFP+ LR (bailed)       AFCT-701SDDZ-IN1
Intel       DUAL RATE 1G/10G SFP+ LR (bailed)       AFCT-701SDZ-IN2
|
||||
|
||||
The following is a list of 3rd party SFP+ modules and direct attach cables that
|
||||
have received some testing. Not all modules are applicable to all devices.
|
||||
|
||||
Supplier    Type                                    Part Numbers

Finisar     SFP+ SR bailed, 10g single rate         FTLX8571D3BCL
Avago       SFP+ SR bailed, 10g single rate         AFBR-700SDZ
Finisar     SFP+ LR bailed, 10g single rate         FTLX1471D3BCL

Finisar     DUAL RATE 1G/10G SFP+ SR (No Bail)      FTLX8571D3QCV-IT
Avago       DUAL RATE 1G/10G SFP+ SR (No Bail)      AFBR-703SDZ-IN1
Finisar     DUAL RATE 1G/10G SFP+ LR (No Bail)      FTLX1471D3QCV-IT
Avago       DUAL RATE 1G/10G SFP+ LR (No Bail)      AFCT-701SDZ-IN1
Finisar     1000BASE-T SFP                          FCLF8522P2BTL
Avago       1000BASE-T SFP                          ABCU-5710RZ
|
||||
|
||||
82599-based adapters support all passive and active limiting direct attach
|
||||
cables that comply with SFF-8431 v4.1 and SFF-8472 v10.4 specifications.
|
||||
|
||||
Laser turns off for SFP+ when ifconfig down
|
||||
-------------------------------------------
|
||||
"ifconfig down" turns off the laser for 82599-based SFP+ fiber adapters.
|
||||
"ifconfig up" turns on the laser.
|
||||
|
||||
|
||||
82598-BASED ADAPTERS
|
||||
|
||||
NOTES for 82598-Based Adapters:
|
||||
- Intel(R) Network Adapters that support removable optical modules only support
|
||||
their original module type (i.e., the Intel(R) 10 Gigabit SR Dual Port
|
||||
Express Module only supports SR optical modules). If you plug in a different
|
||||
type of module, the driver will not load.
|
||||
- Hot Swapping/hot plugging optical modules is not supported.
|
||||
- Only single speed, 10 gigabit modules are supported.
|
||||
- LAN on Motherboard (LOMs) may support DA, SR, or LR modules. Other module
|
||||
types are not supported. Please see your system documentation for details.
|
||||
|
||||
The following is a list of 3rd party SFP+ modules and direct attach cables that
|
||||
have received some testing. Not all modules are applicable to all devices.
|
||||
|
||||
Supplier    Type                                    Part Numbers

Finisar     SFP+ SR bailed, 10g single rate         FTLX8571D3BCL
Avago       SFP+ SR bailed, 10g single rate         AFBR-700SDZ
Finisar     SFP+ LR bailed, 10g single rate         FTLX1471D3BCL
|
||||
|
||||
82598-based adapters support all passive direct attach cables that comply
|
||||
with SFF-8431 v4.1 and SFF-8472 v10.4 specifications. Active direct attach
|
||||
cables are not supported.
|
||||
|
||||
|
||||
Flow Control
|
||||
------------
|
||||
Ethernet Flow Control (IEEE 802.3x) can be configured with ethtool to enable
|
||||
receiving and transmitting pause frames for ixgbe. When TX is enabled, PAUSE
|
||||
frames are generated when the receive packet buffer crosses a predefined
|
||||
threshold. When RX is enabled, the transmit unit will halt for the time delay
|
||||
specified when a PAUSE frame is received.
|
||||
|
||||
Flow Control is enabled by default. If you want to disable flow control when
connected to a flow control capable link partner, use ethtool:
|
||||
|
||||
     ethtool -A eth? autoneg off rx off tx off
|
||||
|
||||
NOTE: For 82598 backplane cards entering 1 gig mode, flow control default
|
||||
behavior is changed to off. Flow control in 1 gig mode on these devices can
|
||||
lead to Tx hangs.
|
||||
|
||||
Intel(R) Ethernet Flow Director
|
||||
-------------------------------
|
||||
Supports advanced filters that direct receive packets by their flows to
|
||||
different queues. Enables tight control on routing a flow in the platform.
|
||||
Matches flows and CPU cores for flow affinity. Supports multiple parameters
|
||||
for flexible flow classification and load balancing.
|
||||
|
||||
Flow director is enabled only if the kernel is multiple TX queue capable.
|
||||
|
||||
An included script (set_irq_affinity.sh) automates setting the IRQ to CPU
|
||||
affinity.
|
||||
|
||||
You can verify that the driver is using Flow Director by looking at the
fdir_miss and fdir_match counters in ethtool.
|
||||
|
||||
Other ethtool Commands:
|
||||
To enable Flow Director
|
||||
ethtool -K ethX ntuple on
|
||||
To add a filter
|
||||
Use -U switch. e.g., ethtool -U ethX flow-type tcp4 src-ip 0x178000a
|
||||
action 1
|
||||
To see the list of filters currently present:
|
||||
ethtool -u ethX
|
||||
|
||||
Perfect Filter: Perfect filter is an interface to load the filter table that
|
||||
funnels all flow into queue_0 unless an alternative queue is specified using
|
||||
"action". In that case, any flow that matches the filter criteria will be
|
||||
directed to the appropriate queue.
|
||||
|
||||
If the queue is defined as -1, the filter will drop matching packets.
|
||||
|
||||
To account for filter matches and misses, there are two stats in ethtool:
|
||||
fdir_match and fdir_miss. In addition, rx_queue_N_packets shows the number of
|
||||
packets processed by the Nth queue.
|
||||
|
||||
NOTE: Receive Packet Steering (RPS) and Receive Flow Steering (RFS) are not
|
||||
compatible with Flow Director. If Flow Director is enabled, these will be
|
||||
disabled.
|
||||
|
||||
The following three parameters impact Flow Director.
|
||||
|
||||
FdirMode
|
||||
--------
|
||||
Valid Range: 0-2 (0=off, 1=ATR, 2=Perfect filter mode)
|
||||
Default Value: 1
|
||||
|
||||
Flow Director filtering modes.
|
||||
|
||||
FdirPballoc
|
||||
-----------
|
||||
Valid Range: 0-2 (0=64k, 1=128k, 2=256k)
|
||||
Default Value: 0
|
||||
|
||||
Flow Director allocated packet buffer size.
|
||||
|
||||
AtrSampleRate
|
||||
--------------
|
||||
Valid Range: 1-100
|
||||
Default Value: 20
|
||||
|
||||
Software ATR Tx packet sample rate. For example, when set to 20, every 20th
packet is examined to see whether it will create a new flow.
|
||||
|
||||
Node
|
||||
----
|
||||
Valid Range: 0-n
Default Value: -1 (off)

  0 - n: where n is the number of NUMA nodes (i.e. 0 - 3) currently online in
  your system
  -1: turns this option off
|
||||
|
||||
The Node parameter will allow you to pick which NUMA node you want to have
|
||||
the adapter allocate memory on.
|
||||
|
||||
max_vfs
|
||||
-------
|
||||
Valid Range: 1-63
|
||||
Default Value: 0
|
||||
|
||||
If the value is greater than 0 it will also force the VMDq parameter to be 1
|
||||
or more.
|
||||
|
||||
This parameter adds support for SR-IOV. It causes the driver to spawn up to
max_vfs virtual functions.
|
||||
|
||||
|
||||
Additional Configurations
|
||||
=========================
|
||||
|
||||
Jumbo Frames
|
||||
------------
|
||||
The driver supports Jumbo Frames for all adapters. Jumbo Frames support is
|
||||
enabled by changing the MTU to a value larger than the default of 1500.
|
||||
The maximum value for the MTU is 16110. Use the ifconfig command to
|
||||
increase the MTU size. For example:
|
||||
|
||||
ifconfig ethx mtu 9000 up
|
||||
|
||||
The maximum MTU setting for Jumbo Frames is 16110. This value coincides
|
||||
with the maximum Jumbo Frames size of 16128.
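
The two figures differ by 18 bytes, which matches a 14-byte Ethernet header
plus a 4-byte frame check sequence. Assuming that is the overhead being
counted (no VLAN tag), the relationship can be sketched in C as follows. The
macro and function names are made up for this example; only the values 1500,
16110 and 16128 come from the text.

```c
/* Assumed per-frame overhead: Ethernet header plus frame check sequence. */
#define EXAMPLE_ETH_HEADER_LEN  14
#define EXAMPLE_ETH_FCS_LEN     4

#define EXAMPLE_DEFAULT_MTU     1500    /* default MTU named in the text */
#define EXAMPLE_MAX_MTU         16110   /* maximum MTU named in the text */

/* On-wire frame length for a given MTU, under the assumption above. */
int frame_len_for_mtu(int mtu)
{
    return mtu + EXAMPLE_ETH_HEADER_LEN + EXAMPLE_ETH_FCS_LEN;
}

/* Non-zero if 'mtu' enables Jumbo Frames and is within the stated limit. */
int is_valid_jumbo_mtu(int mtu)
{
    return mtu > EXAMPLE_DEFAULT_MTU && mtu <= EXAMPLE_MAX_MTU;
}
```

With these definitions an MTU of 16110 gives a frame length of 16128, and
the commonly used MTU of 9000 is accepted as a Jumbo Frames setting.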
|
||||
|
||||
Generic Receive Offload, aka GRO
|
||||
--------------------------------
|
||||
The driver supports the in-kernel software implementation of GRO. GRO has
|
||||
shown that by coalescing Rx traffic into larger chunks of data, CPU
|
||||
utilization can be significantly reduced when under large Rx load. GRO is an
|
||||
evolution of the previously-used LRO interface. GRO is able to coalesce
|
||||
other protocols besides TCP. It's also safe to use with configurations that
|
||||
are problematic for LRO, namely bridging and iSCSI.
|
||||
|
||||
Data Center Bridging, aka DCB
|
||||
-----------------------------
|
||||
DCB is a configurable Quality of Service implementation in hardware.
|
||||
It uses the VLAN priority tag (802.1p) to filter traffic. That means
|
||||
that there are 8 different priorities that traffic can be filtered into.
|
||||
It also enables priority flow control which can limit or eliminate the
|
||||
number of dropped packets during network stress. Bandwidth can be
|
||||
allocated to each of these priorities, which is enforced at the hardware
|
||||
level.
|
||||
|
||||
To enable DCB support in ixgbe, you must enable the DCB netlink layer to
|
||||
allow the userspace tools (see below) to communicate with the driver.
|
||||
This can be found in the kernel configuration here:
|
||||
|
||||
-> Networking support
|
||||
-> Networking options
|
||||
-> Data Center Bridging support
|
||||
|
||||
Once this is selected, DCB support must be selected for ixgbe. This can
|
||||
be found here:
|
||||
|
||||
-> Device Drivers
|
||||
-> Network device support (NETDEVICES [=y])
|
||||
-> Ethernet (10000 Mbit) (NETDEV_10000 [=y])
|
||||
-> Intel(R) 10GbE PCI Express adapters support
|
||||
-> Data Center Bridging (DCB) Support
|
||||
|
||||
After these options are selected, you must rebuild your kernel and your
|
||||
modules.
|
||||
|
||||
In order to use DCB, userspace tools must be downloaded and installed.
|
||||
The dcbd tools can be found at:
|
||||
|
||||
http://e1000.sf.net
|
||||
|
||||
Ethtool
|
||||
-------
|
||||
The driver utilizes the ethtool interface for driver configuration and
|
||||
diagnostics, as well as displaying statistical information. The latest
|
||||
ethtool version is required for this functionality.
|
||||
|
||||
The latest release of ethtool can be found at
|
||||
http://ftp.kernel.org/pub/software/network/ethtool/
|
||||
|
||||
FCoE
|
||||
----
|
||||
This release of the ixgbe driver contains new code to enable users to use
|
||||
Fibre Channel over Ethernet (FCoE) and Data Center Bridging (DCB)
|
||||
functionality that is supported by the 82598-based hardware. This code has
|
||||
no default effect on the regular driver operation, and configuring DCB and
|
||||
FCoE is outside the scope of this driver README. Refer to
|
||||
http://www.open-fcoe.org/ for FCoE project information and contact
|
||||
e1000-eedc@lists.sourceforge.net for DCB information.
|
||||
|
||||
MAC and VLAN anti-spoofing feature
|
||||
----------------------------------
|
||||
When a malicious driver attempts to send a spoofed packet, it is dropped by
|
||||
the hardware and not transmitted. An interrupt is sent to the PF driver
|
||||
notifying it of the spoof attempt.
|
||||
|
||||
When a spoofed packet is detected the PF driver will send the following
|
||||
message to the system log (displayed by the "dmesg" command):
|
||||
|
||||
Spoof event(s) detected on VF (n)
|
||||
|
||||
Where n is the VF that attempted to do the spoofing.
|
||||
|
||||
|
||||
Performance Tuning
|
||||
==================
|
||||
|
||||
An excellent article on performance tuning can be found at:
|
||||
|
||||
http://www.redhat.com/promo/summit/2008/downloads/pdf/Thursday/Mark_Wagner.pdf
|
||||
|
||||
|
||||
Known Issues
|
||||
============
|
||||
|
||||
Enabling SR-IOV in a 32-bit or 64-bit Microsoft* Windows* Server 2008/R2
|
||||
Guest OS using Intel (R) 82576-based GbE or Intel (R) 82599-based 10GbE
|
||||
controller under KVM
|
||||
------------------------------------------------------------------------
|
||||
KVM Hypervisor/VMM supports direct assignment of a PCIe device to a VM. This
|
||||
includes traditional PCIe devices, as well as SR-IOV-capable devices using
|
||||
Intel 82576-based and 82599-based controllers.
|
||||
|
||||
While direct assignment of a PCIe device or an SR-IOV Virtual Function (VF)
to a Linux-based VM running a 2.6.32 or later kernel works fine, there is a
known issue with a Microsoft Windows Server 2008 VM that results in a "yellow
bang" error. The problem lies within the KVM VMM itself, not in the Intel
driver or the SR-IOV logic of the VMM: KVM emulates an older CPU model for
the guests, and this older CPU model does not support MSI-X interrupts,
which are a requirement for Intel SR-IOV.
|
||||
|
||||
If you wish to use the Intel 82576 or 82599-based controllers in SR-IOV mode
with KVM and a Microsoft Windows Server 2008 guest, try the following
workaround: tell KVM to emulate a different model of CPU when using qemu to
create the KVM guest:
|
||||
|
||||
"-cpu qemu64,model=13"
|
||||
|
||||
|
||||
Support
|
||||
=======
|
||||
|
||||
For general information, go to the Intel support website at:
|
||||
|
||||
http://support.intel.com
|
||||
|
||||
or the Intel Wired Networking project hosted by Sourceforge at:
|
||||
|
||||
http://e1000.sourceforge.net
|
||||
|
||||
If an issue is identified with the released source code on the supported
|
||||
kernel with a supported adapter, email the specific information related
|
||||
to the issue to e1000-devel@lists.sf.net
|
||||
52
Documentation/networking/ixgbevf.txt
Normal file
Linux* Base Driver for Intel(R) Ethernet Network Connection
|
||||
===========================================================
|
||||
|
||||
Intel Gigabit Linux driver.
|
||||
Copyright(c) 1999 - 2013 Intel Corporation.
|
||||
|
||||
Contents
|
||||
========
|
||||
|
||||
- Identifying Your Adapter
|
||||
- Known Issues/Troubleshooting
|
||||
- Support
|
||||
|
||||
This file describes the ixgbevf Linux* Base Driver for Intel Network
|
||||
Connection.
|
||||
|
||||
The ixgbevf driver supports 82599-based virtual function devices that can only
|
||||
be activated on kernels with CONFIG_PCI_IOV enabled.
|
||||
|
||||
The ixgbevf driver supports virtual functions generated by the ixgbe driver
|
||||
with a max_vfs value of 1 or greater.
|
||||
|
||||
The guest OS loading the ixgbevf driver must support MSI-X interrupts.
|
||||
|
||||
VLANs: There is a limit of a total of 32 shared VLANs to 1 or more VFs.
|
||||
|
||||
Identifying Your Adapter
|
||||
========================
|
||||
|
||||
For more information on how to identify your adapter, go to the Adapter &
|
||||
Driver ID Guide at:
|
||||
|
||||
http://support.intel.com/support/go/network/adapter/idguide.htm
|
||||
|
||||
Known Issues/Troubleshooting
|
||||
============================
|
||||
|
||||
|
||||
Support
|
||||
=======
|
||||
|
||||
For general information, go to the Intel support website at:
|
||||
|
||||
http://support.intel.com
|
||||
|
||||
or the Intel Wired Networking project hosted by Sourceforge at:
|
||||
|
||||
http://sourceforge.net/projects/e1000
|
||||
|
||||
If an issue is identified with the released source code on the supported
|
||||
kernel with a supported adapter, email the specific information related
|
||||
to the issue to e1000-devel@lists.sf.net
|
||||
348
Documentation/networking/l2tp.txt
Normal file
This document describes how to use the kernel's L2TP drivers to
|
||||
provide L2TP functionality. L2TP is a protocol that tunnels one or
|
||||
more sessions over an IP tunnel. It is commonly used for VPNs
|
||||
(L2TP/IPSec) and by ISPs to tunnel subscriber PPP sessions over an IP
|
||||
network infrastructure. With L2TPv3, it is also useful as a Layer-2
|
||||
tunneling infrastructure.
|
||||
|
||||
Features
|
||||
========
|
||||
|
||||
L2TPv2 (PPP over L2TP (UDP tunnels)).
|
||||
L2TPv3 ethernet pseudowires.
|
||||
L2TPv3 PPP pseudowires.
|
||||
L2TPv3 IP encapsulation.
|
||||
Netlink sockets for L2TPv3 configuration management.
|
||||
|
||||
History
|
||||
=======
|
||||
|
||||
The original pppol2tp driver was introduced in 2.6.23 and provided
|
||||
L2TPv2 functionality (rfc2661). L2TPv2 is used to tunnel one or more PPP
|
||||
sessions over a UDP tunnel.
|
||||
|
||||
L2TPv3 (rfc3931) changes the protocol to allow different frame types
|
||||
to be passed over an L2TP tunnel by moving the PPP-specific parts of
|
||||
the protocol out of the core L2TP packet headers. Each frame type is
|
||||
known as a pseudowire type. Ethernet, PPP, HDLC, Frame Relay and ATM
|
||||
pseudowires for L2TP are defined in separate RFC standards. Another
|
||||
change for L2TPv3 is that it can be carried directly over IP with no
|
||||
UDP header (UDP is optional). It is also possible to create static
|
||||
unmanaged L2TPv3 tunnels manually without a control protocol
|
||||
(userspace daemon) to manage them.
|
||||
|
||||
To support L2TPv3, the original pppol2tp driver was split up to
|
||||
separate the L2TP and PPP functionality. Existing L2TPv2 userspace
|
||||
apps should be unaffected as the original pppol2tp sockets API is
|
||||
retained. L2TPv3, however, uses netlink to manage L2TPv3 tunnels and
|
||||
sessions.
|
||||
|
||||
Design
|
||||
======
|
||||
|
||||
The L2TP protocol separates control and data frames. The L2TP kernel
|
||||
drivers handle only L2TP data frames; control frames are always
|
||||
handled by userspace. L2TP control frames carry messages between L2TP
|
||||
clients/servers and are used to set up / tear down tunnels and
|
||||
sessions. An L2TP client or server is implemented in userspace.
|
||||
|
||||
Each L2TP tunnel is implemented using a UDP or L2TPIP socket; L2TPIP
|
||||
provides L2TPv3 IP encapsulation (no UDP) and is implemented using a
|
||||
new l2tpip socket family. The tunnel socket is typically created by
|
||||
userspace, though for unmanaged L2TPv3 tunnels, the socket can also be
|
||||
created by the kernel. Each L2TP session (pseudowire) gets a network
|
||||
interface instance. In the case of PPP, these interfaces are created
|
||||
indirectly by pppd using a pppol2tp socket. In the case of ethernet,
|
||||
the netdevice is created upon a netlink request to create an L2TPv3
|
||||
ethernet pseudowire.
|
||||
|
||||
For PPP, the PPPoL2TP driver, net/l2tp/l2tp_ppp.c, provides a
|
||||
mechanism by which PPP frames carried through an L2TP session are
|
||||
passed through the kernel's PPP subsystem. The standard PPP daemon,
|
||||
pppd, handles all PPP interaction with the peer. PPP network
|
||||
interfaces are created for each local PPP endpoint. The kernel's PPP
|
||||
subsystem arranges for PPP control frames to be delivered to pppd,
|
||||
while data frames are forwarded as usual.
|
||||
|
||||
For ethernet, the L2TPETH driver, net/l2tp/l2tp_eth.c, implements a
|
||||
netdevice driver, managing virtual ethernet devices, one per
|
||||
pseudowire. These interfaces can be managed using standard Linux tools
|
||||
such as "ip" and "ifconfig". If only IP frames are passed over the
|
||||
tunnel, the interface can be given the IP addresses of itself and its
|
||||
peer. If non-IP frames are to be passed over the tunnel, the interface
|
||||
can be added to a bridge using brctl. All L2TP datapath protocol
|
||||
functions are handled by the L2TP core driver.
|
||||
|
||||
Each tunnel and session within a tunnel is assigned a unique tunnel_id
|
||||
and session_id. These ids are carried in the L2TP header of every
|
||||
control and data packet. (Actually, in L2TPv3, the tunnel_id isn't
|
||||
present in data frames - it is inferred from the IP connection on
|
||||
which the packet was received.) The L2TP driver uses the ids to look up
|
||||
internal tunnel and/or session contexts to determine how to handle the
|
||||
packet. Zero tunnel / session ids are treated specially - zero ids are
|
||||
never assigned to tunnels or sessions in the network. In the driver,
|
||||
the tunnel context keeps a reference to the tunnel UDP or L2TPIP
|
||||
socket. The session context holds data that lets the driver interface
|
||||
to the kernel's network frame type subsystems, i.e. PPP, ethernet.
|
||||
|
||||
Userspace Programming
|
||||
=====================
|
||||
|
||||
For L2TPv2, there are a number of requirements on the userspace L2TP
|
||||
daemon in order to use the pppol2tp driver.
|
||||
|
||||
1. Use a UDP socket per tunnel.
|
||||
|
||||
2. Create a single PPPoL2TP socket per tunnel bound to a special null
|
||||
session id. This is used only for communicating with the driver but
|
||||
must remain open while the tunnel is active. Opening this tunnel
|
||||
management socket causes the driver to mark the tunnel socket as an
|
||||
L2TP UDP encapsulation socket and flags it for use by the
|
||||
referenced tunnel id. This hooks up the UDP receive path via
|
||||
udp_encap_rcv() in net/ipv4/udp.c. PPP data frames are never passed
|
||||
in this special PPPoX socket.
|
||||
|
||||
3. Create a PPPoL2TP socket per L2TP session. This is typically done
|
||||
by starting pppd with the pppol2tp plugin and appropriate
|
||||
arguments. A PPPoL2TP tunnel management socket (Step 2) must be
|
||||
created before the first PPPoL2TP session socket is created.
|
||||
|
||||
When creating PPPoL2TP sockets, the application provides information
|
||||
to the driver about the socket in a socket connect() call. Source and
|
||||
destination tunnel and session ids are provided, as well as the file
|
||||
descriptor of a UDP socket. See struct pppol2tp_addr in
|
||||
include/linux/if_pppol2tp.h. Note that zero tunnel / session ids are
|
||||
treated specially. When creating the per-tunnel PPPoL2TP management
|
||||
socket in Step 2 above, zero source and destination session ids are
|
||||
specified, which tells the driver to prepare the supplied UDP file
|
||||
descriptor for use as an L2TP tunnel socket.
|
||||
|
||||
Userspace may control behavior of the tunnel or session using
|
||||
setsockopt and ioctl on the PPPoX socket. The following socket
|
||||
options are supported:-
|
||||
|
||||
DEBUG     - bitmask of debug message categories. See below.
SENDSEQ   - 0 => don't send packets with sequence numbers
            1 => send packets with sequence numbers
RECVSEQ   - 0 => receive packet sequence numbers are optional
            1 => drop receive packets without sequence numbers
LNSMODE   - 0 => act as LAC.
            1 => act as LNS.
REORDERTO - reorder timeout (in millisecs). If 0, don't try to reorder.
|
||||
|
||||
Only the DEBUG option is supported by the special tunnel management
|
||||
PPPoX socket.
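
As a rough illustration of how these options are set, the sketch below turns
on sequence numbers in both directions on a session socket and sets a reorder
timeout. It assumes the PPPOL2TP_SO_* option names exported by
<linux/if_pppol2tp.h> and integer option values. The fallback definition of
SOL_PPPOL2TP (273) is only used if the C library headers do not already
provide it, and the helper names are made up for this example.

```c
#include <sys/socket.h>
#include <linux/if_pppol2tp.h>

#ifndef SOL_PPPOL2TP
#define SOL_PPPOL2TP 273        /* assumed value if libc does not define it */
#endif

/* Set one integer-valued PPPoL2TP socket option. Returns 0 or -1. */
static int pppol2tp_set_opt(int fd, int optname, int value)
{
    return setsockopt(fd, SOL_PPPOL2TP, optname, &value, sizeof(value));
}

/*
 * Require sequence numbers on a session socket and allow 'reorder_ms'
 * milliseconds for out-of-order packets to arrive.
 * Returns 0 on success, -1 if any option could not be set.
 */
int pppol2tp_require_sequencing(int session_fd, int reorder_ms)
{
    if (pppol2tp_set_opt(session_fd, PPPOL2TP_SO_SENDSEQ, 1) < 0)
        return -1;
    if (pppol2tp_set_opt(session_fd, PPPOL2TP_SO_RECVSEQ, 1) < 0)
        return -1;
    return pppol2tp_set_opt(session_fd, PPPOL2TP_SO_REORDERTO, reorder_ms);
}
```

The session socket passed in would be a connected PPPoL2TP session socket;
on any other descriptor the calls simply fail and the helper returns -1.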
|
||||
|
||||
In addition to the standard PPP ioctls, a PPPIOCGL2TPSTATS is provided
|
||||
to retrieve tunnel and session statistics from the kernel using the
|
||||
PPPoX socket of the appropriate tunnel or session.
|
||||
|
||||
For L2TPv3, userspace must use the netlink API defined in
|
||||
include/linux/l2tp.h to manage tunnel and session contexts. The
|
||||
general procedure to create a new L2TP tunnel with one session is:-
|
||||
|
||||
1. Open a GENL socket using L2TP_GENL_NAME for configuring the kernel
|
||||
using netlink.
|
||||
|
||||
2. Create a UDP or L2TPIP socket for the tunnel.
|
||||
|
||||
3. Create a new L2TP tunnel using a L2TP_CMD_TUNNEL_CREATE
|
||||
request. Set attributes according to desired tunnel parameters,
|
||||
referencing the UDP or L2TPIP socket created in the previous step.
|
||||
|
||||
4. Create a new L2TP session in the tunnel using a
|
||||
L2TP_CMD_SESSION_CREATE request.
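
Step 2, for the UDP case, can be sketched with ordinary socket calls. This is
a minimal example under stated assumptions: the helper name
open_udp_tunnel_socket is hypothetical, IPv4 is assumed, and the socket is
both bound to the local address and connected to the peer before it is
handed to the kernel. The netlink requests of steps 1, 3 and 4 are not shown.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Create a UDP socket bound to local_ip:local_port and connected to
 * peer_ip:peer_port. Returns the socket fd, or -1 on error.
 */
int open_udp_tunnel_socket(const char *local_ip, unsigned short local_port,
                           const char *peer_ip, unsigned short peer_port)
{
    struct sockaddr_in local, peer;
    int fd;

    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_port = htons(local_port);
    if (inet_pton(AF_INET, local_ip, &local.sin_addr) != 1)
        return -1;

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(peer_port);
    if (inet_pton(AF_INET, peer_ip, &peer.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0 ||
        connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The returned descriptor is the one that would be referenced when creating
the tunnel in step 3.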
|
||||
|
||||
The tunnel and all of its sessions are closed when the tunnel socket
|
||||
is closed. The netlink API may also be used to delete sessions and
|
||||
tunnels. Configuration and status info may be set or read using netlink.
|
||||
|
||||
The L2TP driver also supports static (unmanaged) L2TPv3 tunnels. These
|
||||
are where there is no L2TP control message exchange with the peer to
|
||||
set up the tunnel; the tunnel is configured manually at each end of the
|
||||
tunnel. There is no need for an L2TP userspace application in this
|
||||
case -- the tunnel socket is created by the kernel and configured
|
||||
using parameters sent in the L2TP_CMD_TUNNEL_CREATE netlink
|
||||
request. The "ip" utility of iproute2 has commands for managing static
|
||||
L2TPv3 tunnels; do "ip l2tp help" for more information.
|
||||
|
||||
Debugging
|
||||
=========
|
||||
|
||||
The driver supports a flexible debug scheme where kernel trace
|
||||
messages may be optionally enabled per tunnel and per session. Care is
|
||||
needed when debugging a live system since the messages are not
|
||||
rate-limited and a busy system could be swamped. Userspace uses
|
||||
setsockopt on the PPPoX socket to set a debug mask.
|
||||
|
||||
The following debug mask bits are available:
|
||||
|
||||
PPPOL2TP_MSG_DEBUG    verbose debug (if compiled in)
PPPOL2TP_MSG_CONTROL  userspace - kernel interface
PPPOL2TP_MSG_SEQ      sequence numbers handling
PPPOL2TP_MSG_DATA     data packets
|
||||
|
||||
If enabled, files under a l2tp debugfs directory can be used to dump
|
||||
kernel state about L2TP tunnels and sessions. To access it, the
|
||||
debugfs filesystem must first be mounted.
|
||||
|
||||
# mount -t debugfs debugfs /debug
|
||||
|
||||
Files under the l2tp directory can then be accessed.
|
||||
|
||||
# cat /debug/l2tp/tunnels
|
||||
|
||||
The debugfs files should not be used by applications to obtain L2TP
|
||||
state information because the file format is subject to change. It is
|
||||
implemented to provide extra debug information to help diagnose
|
||||
problems. Users should use the netlink API.
|
||||
|
||||
/proc/net/pppol2tp is also provided for backwards compatibility with
|
||||
the original pppol2tp driver. It lists information about L2TPv2
|
||||
tunnels and sessions only. Its use is discouraged.
|
||||
|
||||
Unmanaged L2TPv3 Tunnels
|
||||
========================
|
||||
|
||||
Some commercial L2TP products support unmanaged L2TPv3 ethernet
|
||||
tunnels, where there is no L2TP control protocol; tunnels are
|
||||
configured at each side manually. New commands are available in
|
||||
iproute2's ip utility to support this.
|
||||
|
||||
To create an L2TPv3 ethernet pseudowire between local host 192.168.1.1
|
||||
and peer 192.168.1.2, using IP addresses 10.5.1.1 and 10.5.1.2 for the
|
||||
tunnel endpoints:-
|
||||
|
||||
# modprobe l2tp_eth
|
||||
# modprobe l2tp_netlink
|
||||
|
||||
# ip l2tp add tunnel tunnel_id 1 peer_tunnel_id 1 udp_sport 5000 \
|
||||
udp_dport 5000 encap udp local 192.168.1.1 remote 192.168.1.2
|
||||
# ip l2tp add session tunnel_id 1 session_id 1 peer_session_id 1
|
||||
# ifconfig -a
|
||||
# ip addr add 10.5.1.2/32 peer 10.5.1.1/32 dev l2tpeth0
|
||||
# ifconfig l2tpeth0 up
|
||||
|
||||
Choose IP addresses to be the address of a local IP interface and that
|
||||
of the remote system. The IP addresses of the l2tpeth0 interface can be
|
||||
anything suitable.
|
||||
|
||||
Repeat the above at the peer, with ports, tunnel/session ids and IP
|
||||
addresses reversed. The tunnel and session IDs can be any non-zero
|
||||
32-bit number, but the values must be reversed at the peer.
|
||||
|
||||
Host 1                         Host 2
udp_sport=5000                 udp_sport=5001
udp_dport=5001                 udp_dport=5000
tunnel_id=42                   tunnel_id=45
peer_tunnel_id=45              peer_tunnel_id=42
session_id=128                 session_id=5196755
peer_session_id=5196755        peer_session_id=128
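
The mirroring rule shown in the table can be checked mechanically before the
two ends are configured. The following C sketch is only an illustration: the
struct and function names are hypothetical and are not part of the kernel or
iproute2.

```c
#include <stdint.h>

/* Static L2TPv3 parameters for one end of an unmanaged tunnel. */
struct l2tp_static_cfg {
    uint16_t udp_sport;
    uint16_t udp_dport;
    uint32_t tunnel_id;
    uint32_t peer_tunnel_id;
    uint32_t session_id;
    uint32_t peer_session_id;
};

/*
 * Return 1 if 'b' is the mirror image of 'a' (ports and ids swapped)
 * and all tunnel/session ids are non-zero; return 0 otherwise.
 */
int l2tp_cfg_is_mirror(const struct l2tp_static_cfg *a,
                       const struct l2tp_static_cfg *b)
{
    /* Zero ids are never assigned to tunnels or sessions. */
    if (a->tunnel_id == 0 || a->session_id == 0 ||
        b->tunnel_id == 0 || b->session_id == 0)
        return 0;

    return a->udp_sport == b->udp_dport &&
           a->udp_dport == b->udp_sport &&
           a->tunnel_id == b->peer_tunnel_id &&
           a->peer_tunnel_id == b->tunnel_id &&
           a->session_id == b->peer_session_id &&
           a->peer_session_id == b->session_id;
}
```

Filling two structures with the Host 1 and Host 2 columns above makes the
function return 1; changing any single value on one side makes it return 0.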
|
||||
|
||||
When done at both ends of the tunnel, it should be possible to send
|
||||
data over the network. e.g.
|
||||
|
||||
# ping 10.5.1.1
|
||||
|
||||
|
||||
Sample Userspace Code
|
||||
=====================
|
||||
|
||||
1. Create tunnel management PPPoX socket
|
||||
|
||||
        kernel_fd = socket(AF_PPPOX, SOCK_DGRAM, PX_PROTO_OL2TP);
        if (kernel_fd >= 0) {
                struct sockaddr_pppol2tp sax;
                struct sockaddr_in const *peer_addr;

                peer_addr = l2tp_tunnel_get_peer_addr(tunnel);
                memset(&sax, 0, sizeof(sax));
                sax.sa_family = AF_PPPOX;
                sax.sa_protocol = PX_PROTO_OL2TP;
                sax.pppol2tp.fd = udp_fd;       /* fd of tunnel UDP socket */
                sax.pppol2tp.addr.sin_addr.s_addr = peer_addr->sin_addr.s_addr;
                sax.pppol2tp.addr.sin_port = peer_addr->sin_port;
                sax.pppol2tp.addr.sin_family = AF_INET;
                sax.pppol2tp.s_tunnel = tunnel_id;
                sax.pppol2tp.s_session = 0;     /* special case: mgmt socket */
                sax.pppol2tp.d_tunnel = 0;
                sax.pppol2tp.d_session = 0;     /* special case: mgmt socket */

                if (connect(kernel_fd, (struct sockaddr *)&sax,
                            sizeof(sax)) < 0) {
                        result = -errno;        /* save errno before perror() */
                        perror("connect failed");
                        goto err;
                }
        }
|
||||
|
||||
2. Create session PPPoX data socket
|
||||
|
||||
        struct sockaddr_pppol2tp sax;
        int ret;

        /* Note, the target socket must be bound already, else it will not be ready */
        memset(&sax, 0, sizeof(sax));
        sax.sa_family = AF_PPPOX;
        sax.sa_protocol = PX_PROTO_OL2TP;
        sax.pppol2tp.fd = tunnel_fd;
        sax.pppol2tp.addr.sin_addr.s_addr = addr->sin_addr.s_addr;
        sax.pppol2tp.addr.sin_port = addr->sin_port;
        sax.pppol2tp.addr.sin_family = AF_INET;
        sax.pppol2tp.s_tunnel  = tunnel_id;
        sax.pppol2tp.s_session = session_id;
        sax.pppol2tp.d_tunnel  = peer_tunnel_id;
        sax.pppol2tp.d_session = peer_session_id;

        /* session_fd is the fd of the session's PPPoL2TP socket.
         * tunnel_fd is the fd of the tunnel UDP socket.
         */
        ret = connect(session_fd, (struct sockaddr *)&sax, sizeof(sax));
        if (ret < 0) {
                return -errno;
        }
        return 0;
|
||||
|
||||
Internal Implementation
|
||||
=======================
|
||||
|
||||
The driver keeps a struct l2tp_tunnel context per L2TP tunnel and a
|
||||
struct l2tp_session context for each session. The l2tp_tunnel is
|
||||
always associated with a UDP or L2TP/IP socket and keeps a list of
|
||||
sessions in the tunnel. The l2tp_session context keeps kernel state
|
||||
about the session. It has private data which is used for data specific
|
||||
to the session type. With L2TPv2, the session always carried PPP
|
||||
traffic. With L2TPv3, the session can also carry ethernet frames
|
||||
(ethernet pseudowire) or other data types such as ATM, HDLC or Frame
|
||||
Relay.
|
||||
|
||||
When a tunnel is first opened, the reference count on the socket is
|
||||
increased using sock_hold(). This ensures that the kernel socket
|
||||
cannot be removed while L2TP's data structures reference it.
|
||||
|
||||
Some L2TP sessions also have a socket (PPP pseudowires) while others
|
||||
do not (ethernet pseudowires). We can't use the socket reference count
|
||||
as the reference count for session contexts. The L2TP implementation
|
||||
therefore has its own internal reference counts on the session
|
||||
contexts.
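
The idea of a context that carries its own reference count, independent of any socket, can be sketched in userspace C as follows. This is an illustration only: the names are hypothetical, it uses C11 atomics rather than kernel primitives, and it is not the driver's actual code.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a session context; not the kernel struct. */
struct demo_session {
	atomic_int ref_count;
};

/* Number of contexts currently allocated, kept only for the demo. */
int demo_live_sessions;

/* Allocate a context holding one reference for the caller. */
struct demo_session *demo_session_create(void)
{
	struct demo_session *s = malloc(sizeof(*s));

	if (!s)
		return NULL;
	atomic_init(&s->ref_count, 1);
	demo_live_sessions++;
	return s;
}

/* Take an additional reference. */
void demo_session_hold(struct demo_session *s)
{
	atomic_fetch_add(&s->ref_count, 1);
}

/* Drop a reference; free the context when the last one goes away. */
void demo_session_put(struct demo_session *s)
{
	/* fetch_sub returns the previous value: 1 means this was the last ref */
	if (atomic_fetch_sub(&s->ref_count, 1) == 1) {
		demo_live_sessions--;
		free(s);
	}
}
```

Every path that stores a pointer to the context takes a reference and drops it when done, so the context lives exactly as long as something refers to it, whether or not the session has a socket.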
|
||||
|
||||
To Do
|
||||
=====
|
||||
|
||||
Add L2TP tunnel switching support. This would route tunneled traffic
|
||||
from one L2TP tunnel into another. Specified in
|
||||
http://tools.ietf.org/html/draft-ietf-l2tpext-tunnel-switching-08
|
||||
|
||||
Add L2TPv3 VLAN pseudowire support.
|
||||
|
||||
Add L2TPv3 IP pseudowire support.
|
||||
|
||||
Add L2TPv3 ATM pseudowire support.
|
||||
|
||||
Miscellaneous
|
||||
=============
|
||||
|
||||
The L2TP drivers were developed as part of the OpenL2TP project by
|
||||
Katalix Systems Ltd. OpenL2TP is a full-featured L2TP client / server,
|
||||
designed from the ground up to have the L2TP datapath in the
|
||||
kernel. The project also implemented the pppol2tp plugin for pppd
|
||||
which allows pppd to use the kernel driver. Details can be found at
|
||||
http://www.openl2tp.org.
|
||||
263
Documentation/networking/lapb-module.txt
Normal file
|
|
@ -0,0 +1,263 @@
|
|||
The Linux LAPB Module Interface 1.3
|
||||
|
||||
Jonathan Naylor 29.12.96
|
||||
|
||||
Changed (Henner Eisen, 2000-10-29): int return value for data_indication()
|
||||
|
||||
The LAPB module will be a separately compiled module for use by any parts of
|
||||
the Linux operating system that require a LAPB service. This document
|
||||
defines the interfaces to, and the services provided by this module. The
|
||||
term module in this context does not imply that the LAPB module is a
|
||||
separately loadable module, although it may be. The term module is used in
|
||||
its more standard meaning.
|
||||
|
||||
The interface to the LAPB module consists of functions to the module,
|
||||
callbacks from the module to indicate important state changes, and
|
||||
structures for getting and setting information about the module.
|
||||
|
||||
Structures
|
||||
----------
|
||||
|
||||
Probably the most important structure is the skbuff structure for holding
|
||||
received and transmitted data, however it is beyond the scope of this
|
||||
document.
|
||||
|
||||
The two LAPB specific structures are the LAPB initialisation structure and
|
||||
the LAPB parameter structure. These will be defined in a standard header
|
||||
file, <linux/lapb.h>. The header file <net/lapb.h> is internal to the LAPB
|
||||
module and is not for use.
|
||||
|
||||
LAPB Initialisation Structure
|
||||
-----------------------------
|
||||
|
||||
This structure is used only once, in the call to lapb_register (see below).
|
||||
It contains information about the device driver that requires the services
|
||||
of the LAPB module.
|
||||
|
||||
struct lapb_register_struct {
	void (*connect_confirmation)(void *token, int reason);
	void (*connect_indication)(void *token, int reason);
	void (*disconnect_confirmation)(void *token, int reason);
	void (*disconnect_indication)(void *token, int reason);
	int  (*data_indication)(void *token, struct sk_buff *skb);
	void (*data_transmit)(void *token, struct sk_buff *skb);
};
|
||||
|
||||
Each member of this structure corresponds to a function in the device driver
|
||||
that is called when a particular event in the LAPB module occurs. These will
|
||||
be described in detail below. If a callback is not required (!!) then a NULL
|
||||
may be substituted.
|
||||
|
||||
|
||||
LAPB Parameter Structure
|
||||
------------------------
|
||||
|
||||
This structure is used with the lapb_getparms and lapb_setparms functions
|
||||
(see below). They are used to allow the device driver to get and set the
|
||||
operational parameters of the LAPB implementation for a given connection.
|
||||
|
||||
struct lapb_parms_struct {
	unsigned int t1;
	unsigned int t1timer;
	unsigned int t2;
	unsigned int t2timer;
	unsigned int n2;
	unsigned int n2count;
	unsigned int window;
	unsigned int state;
	unsigned int mode;
};
|
||||
|
||||
T1 and T2 are protocol timing parameters and are given in units of 100ms. N2
|
||||
is the maximum number of tries on the link before it is declared a failure.
|
||||
The window size is the maximum number of outstanding data packets allowed to
|
||||
be unacknowledged by the remote end, the value of the window is between 1
|
||||
and 7 for a standard LAPB link, and between 1 and 127 for an extended LAPB
|
||||
link.
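
The units and ranges above can be expressed as two small helpers. This is a sketch under stated assumptions: the struct is a local illustration that borrows field names from lapb_parms_struct, the extended flag stands in for the mode bit, and none of these helpers exist in the LAPB module.

```c
#include <stdbool.h>

/* Local illustration only; field names follow lapb_parms_struct above. */
struct demo_lapb_parms {
	unsigned int t1;	/* units of 100ms */
	unsigned int t2;	/* units of 100ms */
	unsigned int n2;	/* maximum number of tries */
	unsigned int window;	/* outstanding unacknowledged packets */
	bool extended;		/* true for an extended LAPB link */
};

/* Convert a timer value from the 100ms units used above to milliseconds. */
unsigned int demo_lapb_timer_ms(unsigned int units)
{
	return units * 100u;
}

/* Check the window against the ranges given above: 1..7 or 1..127. */
bool demo_lapb_window_ok(const struct demo_lapb_parms *p)
{
	unsigned int max = p->extended ? 127u : 7u;

	return p->window >= 1u && p->window <= max;
}
```

For example, t1 = 50 corresponds to 5000 ms, and a window of 8 is acceptable only on an extended link.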
|
||||
|
||||
The mode variable is a bit field used for setting (at present) three values.
|
||||
The bit fields have the following meanings:
|
||||
|
||||
Bit	Meaning
0	LAPB operation (0=LAPB_STANDARD 1=LAPB_EXTENDED).
1	[SM]LP operation (0=LAPB_SLP 1=LAPB_MLP).
2	DTE/DCE operation (0=LAPB_DTE 1=LAPB_DCE).
3-31	Reserved, must be 0.
|
||||
|
||||
Extended LAPB operation indicates the use of extended sequence numbers and
|
||||
consequently larger window sizes, the default is standard LAPB operation.
|
||||
MLP operation is the same as SLP operation except that the addresses used by
|
||||
LAPB are different to indicate the mode of operation, the default is Single
|
||||
Link Procedure. The difference between DCE and DTE operation is (i) the
|
||||
addresses used for commands and responses, and (ii) when the DCE is not
|
||||
connected, it sends DM without polls set, every T1. The upper case constant
|
||||
names will be defined in the public LAPB header file.
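
The bit layout described above can be tested with simple masks. The sketch below defines its own DEMO_ constants from the bit positions in the table, so that they are not mistaken for the names in the public header; the helper functions are hypothetical.

```c
#include <stdbool.h>

/* Bit values derived from the table above; local to this illustration. */
#define DEMO_LAPB_EXTENDED	0x01u		/* bit 0 */
#define DEMO_LAPB_MLP		0x02u		/* bit 1 */
#define DEMO_LAPB_DCE		0x04u		/* bit 2 */
#define DEMO_LAPB_RESERVED	(~0x07u)	/* bits 3 and up must be 0 */

/* A mode value is valid only if no reserved bit is set. */
bool demo_mode_valid(unsigned int mode)
{
	return (mode & DEMO_LAPB_RESERVED) == 0;
}

bool demo_mode_is_extended(unsigned int mode)
{
	return (mode & DEMO_LAPB_EXTENDED) != 0;
}

bool demo_mode_is_mlp(unsigned int mode)
{
	return (mode & DEMO_LAPB_MLP) != 0;
}

bool demo_mode_is_dce(unsigned int mode)
{
	return (mode & DEMO_LAPB_DCE) != 0;
}
```

A mode of 0 therefore selects the defaults: standard LAPB, Single Link Procedure, DTE.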
|
||||
|
||||
|
||||
Functions
|
||||
---------
|
||||
|
||||
The LAPB module provides a number of function entry points.
|
||||
|
||||
|
||||
int lapb_register(void *token, struct lapb_register_struct *callbacks);
|
||||
|
||||
This must be called before the LAPB module may be used. If the call is
|
||||
successful then LAPB_OK is returned. The token must be a unique identifier
|
||||
generated by the device driver to allow for the unique identification of the
|
||||
instance of the LAPB link. It is returned by the LAPB module in all of the
|
||||
callbacks, and is used by the device driver in all calls to the LAPB module.
|
||||
For multiple LAPB links in a single device driver, multiple calls to
|
||||
lapb_register must be made. The format of the lapb_register_struct is given
|
||||
above. The return values are:
|
||||
|
||||
LAPB_OK LAPB registered successfully.
|
||||
LAPB_BADTOKEN Token is already registered.
|
||||
LAPB_NOMEM Out of memory
|
||||
|
||||
|
||||
int lapb_unregister(void *token);
|
||||
|
||||
This releases all the resources associated with a LAPB link. Any current
|
||||
LAPB link will be abandoned without further messages being passed. After
|
||||
this call, the value of token is no longer valid for any calls to the LAPB
functions. The valid return values are:
|
||||
|
||||
LAPB_OK LAPB unregistered successfully.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
|
||||
|
||||
int lapb_getparms(void *token, struct lapb_parms_struct *parms);
|
||||
|
||||
This allows the device driver to get the values of the current LAPB
|
||||
variables, the lapb_parms_struct is described above. The valid return values
|
||||
are:
|
||||
|
||||
LAPB_OK LAPB getparms was successful.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
|
||||
|
||||
int lapb_setparms(void *token, struct lapb_parms_struct *parms);
|
||||
|
||||
This allows the device driver to set the values of the current LAPB
|
||||
variables, the lapb_parms_struct is described above. The values of t1timer,
|
||||
t2timer and n2count are ignored, likewise changing the mode bits when
|
||||
connected will be ignored. An error implies that none of the values have
|
||||
been changed. The valid return values are:
|
||||
|
||||
LAPB_OK LAPB setparms was successful.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
LAPB_INVALUE One of the values was out of its allowable range.
|
||||
|
||||
|
||||
int lapb_connect_request(void *token);
|
||||
|
||||
Initiate a connect using the current parameter settings. The valid return
|
||||
values are:
|
||||
|
||||
LAPB_OK LAPB is starting to connect.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
LAPB_CONNECTED LAPB module is already connected.
|
||||
|
||||
|
||||
int lapb_disconnect_request(void *token);
|
||||
|
||||
Initiate a disconnect. The valid return values are:
|
||||
|
||||
LAPB_OK LAPB is starting to disconnect.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
LAPB_NOTCONNECTED LAPB module is not connected.
|
||||
|
||||
|
||||
int lapb_data_request(void *token, struct sk_buff *skb);
|
||||
|
||||
Queue data with the LAPB module for transmitting over the link. If the call
|
||||
is successful then the skbuff is owned by the LAPB module and may not be
|
||||
used by the device driver again. The valid return values are:
|
||||
|
||||
LAPB_OK LAPB has accepted the data.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
LAPB_NOTCONNECTED LAPB module is not connected.
|
||||
|
||||
|
||||
int lapb_data_received(void *token, struct sk_buff *skb);
|
||||
|
||||
Queue data with the LAPB module which has been received from the device. It
|
||||
is expected that the data passed to the LAPB module has skb->data pointing
|
||||
to the beginning of the LAPB data. If the call is successful then the skbuff
|
||||
is owned by the LAPB module and may not be used by the device driver again.
|
||||
The valid return values are:
|
||||
|
||||
LAPB_OK LAPB has accepted the data.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
|
||||
|
||||
Callbacks
|
||||
---------
|
||||
|
||||
These callbacks are functions provided by the device driver for the LAPB
|
||||
module to call when an event occurs. They are registered with the LAPB
|
||||
module with lapb_register (see above) in the structure lapb_register_struct
|
||||
(see above).
|
||||
|
||||
|
||||
void (*connect_confirmation)(void *token, int reason);
|
||||
|
||||
This is called by the LAPB module when a connection is established after
|
||||
being requested by a call to lapb_connect_request (see above). The reason is
|
||||
always LAPB_OK.
|
||||
|
||||
|
||||
void (*connect_indication)(void *token, int reason);
|
||||
|
||||
This is called by the LAPB module when the link is established by the remote
|
||||
system. The value of reason is always LAPB_OK.
|
||||
|
||||
|
||||
void (*disconnect_confirmation)(void *token, int reason);
|
||||
|
||||
This is called by the LAPB module when an event occurs after the device
|
||||
driver has called lapb_disconnect_request (see above). The reason indicates
|
||||
what has happened. In all cases the LAPB link can be regarded as being
|
||||
terminated. The values for reason are:
|
||||
|
||||
LAPB_OK The LAPB link was terminated normally.
|
||||
LAPB_NOTCONNECTED The remote system was not connected.
|
||||
LAPB_TIMEDOUT No response was received in N2 tries from the remote
|
||||
system.
|
||||
|
||||
|
||||
void (*disconnect_indication)(void *token, int reason);
|
||||
|
||||
This is called by the LAPB module when the link is terminated by the remote
|
||||
system or another event has occurred to terminate the link. This may be
|
||||
returned in response to a lapb_connect_request (see above) if the remote
|
||||
system refused the request. The values for reason are:
|
||||
|
||||
LAPB_OK The LAPB link was terminated normally by the remote
|
||||
system.
|
||||
LAPB_REFUSED The remote system refused the connect request.
|
||||
LAPB_NOTCONNECTED The remote system was not connected.
|
||||
LAPB_TIMEDOUT No response was received in N2 tries from the remote
|
||||
system.
|
||||
|
||||
|
||||
int (*data_indication)(void *token, struct sk_buff *skb);
|
||||
|
||||
This is called by the LAPB module when data has been received from the
|
||||
remote system that should be passed onto the next layer in the protocol
|
||||
stack. The skbuff becomes the property of the device driver and the LAPB
|
||||
module will not perform any more actions on it. The skb->data pointer will
|
||||
be pointing to the first byte of data after the LAPB header.
|
||||
|
||||
This method should return NET_RX_DROP (as defined in the header
|
||||
file include/linux/netdevice.h) if and only if the frame was dropped
|
||||
before it could be delivered to the upper layer.
|
||||
|
||||
|
||||
void (*data_transmit)(void *token, struct sk_buff *skb);
|
||||
|
||||
This is called by the LAPB module when data is to be transmitted to the
|
||||
remote system by the device driver. The skbuff becomes the property of the
|
||||
device driver and the LAPB module will not perform any more actions on it.
|
||||
The skb->data pointer will be pointing to the first byte of the LAPB header.
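
To show how a driver might fill in the callback table described in this section, here is a userspace mock-up. It is a sketch, not kernel code: struct demo_skb, struct demo_link, the DEMO_ return codes and all demo_ functions are stand-ins invented for the example, and the rule "drop frames while not connected" is merely one plausible policy.

```c
#include <stddef.h>

/* Userspace stand-ins so the sketch compiles outside the kernel. */
struct demo_skb { const unsigned char *data; size_t len; };
#define DEMO_NET_RX_SUCCESS	0
#define DEMO_NET_RX_DROP	1

/* Same shape as the callback prototypes documented above. */
struct demo_lapb_callbacks {
	void (*connect_confirmation)(void *token, int reason);
	void (*connect_indication)(void *token, int reason);
	void (*disconnect_confirmation)(void *token, int reason);
	void (*disconnect_indication)(void *token, int reason);
	int  (*data_indication)(void *token, struct demo_skb *skb);
	void (*data_transmit)(void *token, struct demo_skb *skb);
};

/* Hypothetical per-link driver state; its address serves as the token. */
struct demo_link { int connected; size_t rx_bytes; size_t tx_bytes; };

static void demo_connected(void *token, int reason)
{
	(void)reason;
	((struct demo_link *)token)->connected = 1;
}

static void demo_disconnected(void *token, int reason)
{
	(void)reason;	/* the link is down whatever the reason */
	((struct demo_link *)token)->connected = 0;
}

static int demo_data_indication(void *token, struct demo_skb *skb)
{
	struct demo_link *link = token;

	if (!link->connected)
		return DEMO_NET_RX_DROP;	/* frame not delivered upwards */
	link->rx_bytes += skb->len;
	return DEMO_NET_RX_SUCCESS;
}

static void demo_data_transmit(void *token, struct demo_skb *skb)
{
	/* A real driver would hand the frame to the hardware here. */
	((struct demo_link *)token)->tx_bytes += skb->len;
}

const struct demo_lapb_callbacks demo_callbacks = {
	.connect_confirmation	 = demo_connected,
	.connect_indication	 = demo_connected,
	.disconnect_confirmation = demo_disconnected,
	.disconnect_indication	 = demo_disconnected,
	.data_indication	 = demo_data_indication,
	.data_transmit		 = demo_data_transmit,
};
```

Using the address of the per-link state as the token lets every callback recover its link without a lookup table.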
|
||||
131
Documentation/networking/ltpc.txt
Normal file
|
|
@ -0,0 +1,131 @@
|
|||
This is the ALPHA version of the ltpc driver.
|
||||
|
||||
In order to use it, you will need at least version 1.3.3 of the
|
||||
netatalk package, and the Apple or Farallon LocalTalk PC card.
|
||||
There are a number of different LocalTalk cards for the PC; this
|
||||
driver applies only to the one with the 65c02 processor chip on it.
|
||||
|
||||
To include it in the kernel, select the CONFIG_LTPC switch in the
|
||||
configuration dialog. You can also compile it as a module.
|
||||
|
||||
While the driver will attempt to autoprobe the I/O port address, IRQ
|
||||
line, and DMA channel of the card, this does not always work. For
|
||||
this reason, you should be prepared to supply these parameters
|
||||
yourself. (see "Card Configuration" below for how to determine or
|
||||
change the settings on your card)
|
||||
|
||||
When the driver is compiled into the kernel, you can add a line such
|
||||
as the following to your /etc/lilo.conf:
|
||||
|
||||
append="ltpc=0x240,9,1"
|
||||
|
||||
where the parameters (in order) are the port address, IRQ, and DMA
|
||||
channel. The second and third values can be omitted, in which case
|
||||
the driver will try to determine them itself.
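
The "port[,irq[,dma]]" format of the ltpc= argument can be parsed along the following lines. This is a userspace sketch, not the driver's own parser: the struct and function are hypothetical, and leaving omitted fields at 0 is a convention of this example only.

```c
#include <stdio.h>

/* Hypothetical result type; omitted fields stay 0 in this sketch. */
struct demo_ltpc_cfg {
	unsigned int io;
	unsigned int irq;
	unsigned int dma;
};

/* Parse "port[,irq[,dma]]"; returns the number of fields read, 0 on error. */
int demo_parse_ltpc(const char *arg, struct demo_ltpc_cfg *cfg)
{
	int io = 0, irq = 0, dma = 0;
	/* %i accepts decimal as well as 0x-prefixed hexadecimal input */
	int n = sscanf(arg, "%i,%i,%i", &io, &irq, &dma);

	if (n < 1 || io < 0 || irq < 0 || dma < 0)
		return 0;
	cfg->io = (unsigned int)io;
	cfg->irq = (unsigned int)irq;
	cfg->dma = (unsigned int)dma;
	return n;
}
```

Given "0x240,9,1" the function reports three fields; given only "0x240" it reports one, which corresponds to the case where IRQ and DMA are left for the driver to determine.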
|
||||
|
||||
If you load the driver as a module, you can pass the parameters "io=",
|
||||
"irq=", and "dma=" on the command line with insmod or modprobe, or add
|
||||
them as options in a configuration file in /etc/modprobe.d/ directory:
|
||||
|
||||
alias lt0 ltpc # autoload the module when the interface is configured
|
||||
options ltpc io=0x240 irq=9 dma=1
|
||||
|
||||
Before starting up the netatalk daemons (perhaps in rc.local), you
|
||||
need to add a line such as:
|
||||
|
||||
/sbin/ifconfig lt0 127.0.0.42
|
||||
|
||||
The address is unimportant - however, the card needs to be configured
|
||||
with ifconfig so that Netatalk can find it.
|
||||
|
||||
The appropriate netatalk configuration depends on whether you are
|
||||
attached to a network that includes AppleTalk routers or not. If,
|
||||
like me, you are simply connecting to your home Macintoshes and
|
||||
printers, you need to set up netatalk to "seed". The way I do this
|
||||
is to have the lines
|
||||
|
||||
dummy -seed -phase 2 -net 2000 -addr 2000.26 -zone "1033"
|
||||
lt0 -seed -phase 1 -net 1033 -addr 1033.27 -zone "1033"
|
||||
|
||||
in my atalkd.conf. What is going on here is that I need to fool
|
||||
netatalk into thinking that there are two AppleTalk interfaces
|
||||
present; otherwise, it refuses to seed. This is a hack, and a more
|
||||
permanent solution would be to alter the netatalk code. Also, make
|
||||
sure you have the correct name for the dummy interface - if it's
|
||||
compiled as a module, you will need to refer to it as "dummy0" or some
|
||||
such.
|
||||
|
||||
If you are attached to an extended AppleTalk network, with routers on
|
||||
it, then you don't need to fool around with this -- the appropriate
|
||||
line in atalkd.conf is
|
||||
|
||||
lt0 -phase 1
|
||||
|
||||
--------------------------------------
|
||||
|
||||
Card Configuration:
|
||||
|
||||
The interrupts and so forth are configured via the dipswitch on the
|
||||
board. Set the switches so as not to conflict with other hardware.
|
||||
|
||||
Interrupts -- set at most one. If none are set, the driver uses
|
||||
polled mode. Because the card was developed in the XT era, the
|
||||
original documentation refers to IRQ2. Since you'll be running
|
||||
this on an AT (or later) class machine, that really means IRQ9.
|
||||
|
||||
SW1 IRQ 4
|
||||
SW2 IRQ 3
|
||||
SW3 IRQ 9 (2 in original card documentation only applies to XT)
|
||||
|
||||
|
||||
DMA -- choose DMA 1 or 3, and set both corresponding switches.
|
||||
|
||||
SW4 DMA 3
|
||||
SW5 DMA 1
|
||||
SW6 DMA 3
|
||||
SW7 DMA 1
|
||||
|
||||
|
||||
I/O address -- choose one.
|
||||
|
||||
SW8 220 / 240
|
||||
|
||||
--------------------------------------
|
||||
|
||||
IP:
|
||||
|
||||
Yes, it is possible to do IP over LocalTalk. However, you can't just
|
||||
treat the LocalTalk device like an ordinary Ethernet device, even if
|
||||
that's what it looks like to Netatalk.
|
||||
|
||||
Instead, you follow the same procedure as for doing IP in EtherTalk.
|
||||
See Documentation/networking/ipddp.txt for more information about the
|
||||
kernel driver and userspace tools needed.
|
||||
|
||||
--------------------------------------
|
||||
|
||||
BUGS:
|
||||
|
||||
IRQ autoprobing often doesn't work on a cold boot. To get around
|
||||
this, either compile the driver as a module, or pass the parameters
|
||||
for the card to the kernel as described above.
|
||||
|
||||
Also, as usual, autoprobing is not recommended when you use the driver
|
||||
as a module. (though it usually works at boot time, at least)
|
||||
|
||||
Polled mode is *really* slow sometimes, but this seems to depend on
|
||||
the configuration of the network.
|
||||
|
||||
It may theoretically be possible to use two LTPC cards in the same
|
||||
machine, but this is unsupported, so if you really want to do this,
|
||||
you'll probably have to hack the initialization code a bit.
|
||||
|
||||
______________________________________
|
||||
|
||||
THANKS:
|
||||
Thanks to Alan Cox for helpful discussions early on in this
|
||||
work, and to Denis Hainsworth for doing the bleeding-edge testing.
|
||||
|
||||
-- Bradford Johnson <bradford@math.umn.edu>
|
||||
|
||||
-- Updated 11/09/1998 by David Huggins-Daines <dhd@debian.org>
|
||||
95
Documentation/networking/mac80211-auth-assoc-deauth.txt
Normal file
|
|
@ -0,0 +1,95 @@
|
|||
#
|
||||
# This outlines the Linux authentication/association and
|
||||
# deauthentication/disassociation flows.
|
||||
#
|
||||
# This can be converted into a diagram using the service
|
||||
# at http://www.websequencediagrams.com/
|
||||
#
|
||||
|
||||
participant userspace
|
||||
participant mac80211
|
||||
participant driver
|
||||
|
||||
alt authentication needed (not FT)
|
||||
userspace->mac80211: authenticate
|
||||
|
||||
alt authenticated/authenticating already
|
||||
mac80211->driver: sta_state(AP, not-exists)
|
||||
mac80211->driver: bss_info_changed(clear BSSID)
|
||||
else associated
|
||||
note over mac80211,driver
|
||||
like deauth/disassoc, without sending the
|
||||
BA session stop & deauth/disassoc frames
|
||||
end note
|
||||
end
|
||||
|
||||
mac80211->driver: config(channel, channel type)
|
||||
mac80211->driver: bss_info_changed(set BSSID, basic rate bitmap)
|
||||
mac80211->driver: sta_state(AP, exists)
|
||||
|
||||
alt no probe request data known
|
||||
mac80211->driver: TX directed probe request
|
||||
driver->mac80211: RX probe response
|
||||
end
|
||||
|
||||
mac80211->driver: TX auth frame
|
||||
driver->mac80211: RX auth frame
|
||||
|
||||
alt WEP shared key auth
|
||||
mac80211->driver: TX auth frame
|
||||
driver->mac80211: RX auth frame
|
||||
end
|
||||
|
||||
mac80211->driver: sta_state(AP, authenticated)
|
||||
mac80211->userspace: RX auth frame
|
||||
|
||||
end
|
||||
|
||||
userspace->mac80211: associate
|
||||
alt authenticated or associated
|
||||
note over mac80211,driver: cleanup like for authenticate
|
||||
end
|
||||
|
||||
alt not previously authenticated (FT)
|
||||
mac80211->driver: config(channel, channel type)
|
||||
mac80211->driver: bss_info_changed(set BSSID, basic rate bitmap)
|
||||
mac80211->driver: sta_state(AP, exists)
|
||||
mac80211->driver: sta_state(AP, authenticated)
|
||||
end
|
||||
mac80211->driver: TX assoc
|
||||
driver->mac80211: RX assoc response
|
||||
note over mac80211: init rate control
|
||||
mac80211->driver: sta_state(AP, associated)
|
||||
|
||||
alt not using WPA
|
||||
mac80211->driver: sta_state(AP, authorized)
|
||||
end
|
||||
|
||||
mac80211->driver: set up QoS parameters
|
||||
|
||||
mac80211->driver: bss_info_changed(QoS, HT, associated with AID)
|
||||
mac80211->userspace: associated
|
||||
|
||||
note left of userspace: associated now
|
||||
|
||||
alt using WPA
|
||||
note over userspace
|
||||
do 4-way-handshake
|
||||
(data frames)
|
||||
end note
|
||||
userspace->mac80211: authorized
|
||||
mac80211->driver: sta_state(AP, authorized)
|
||||
end
|
||||
|
||||
userspace->mac80211: deauthenticate/disassociate
|
||||
mac80211->driver: stop BA sessions
|
||||
mac80211->driver: TX deauth/disassoc
|
||||
mac80211->driver: flush frames
|
||||
mac80211->driver: sta_state(AP,associated)
|
||||
mac80211->driver: sta_state(AP,authenticated)
|
||||
mac80211->driver: sta_state(AP,exists)
|
||||
mac80211->driver: sta_state(AP,not-exists)
|
||||
mac80211->driver: turn off powersave
|
||||
mac80211->driver: bss_info_changed(clear BSSID, not associated, no QoS, ...)
|
||||
mac80211->driver: config(channel type to non-HT)
|
||||
mac80211->userspace: disconnected
|
||||
67
Documentation/networking/mac80211-injection.txt
Normal file
|
|
@ -0,0 +1,67 @@
|
|||
How to use packet injection with mac80211
|
||||
=========================================
|
||||
|
||||
mac80211 now allows arbitrary packets to be injected down any Monitor Mode
|
||||
interface from userland. The packet you inject needs to be composed in the
|
||||
following format:
|
||||
|
||||
[ radiotap header ]
|
||||
[ ieee80211 header ]
|
||||
[ payload ]
|
||||
|
||||
The radiotap format is discussed in
|
||||
./Documentation/networking/radiotap-headers.txt.
|
||||
|
||||
Despite many radiotap parameters being currently defined, most only make sense
|
||||
to appear on received packets. The following information is parsed from the
|
||||
radiotap headers and used to control injection:
|
||||
|
||||
* IEEE80211_RADIOTAP_FLAGS
|
||||
|
||||
IEEE80211_RADIOTAP_F_FCS: FCS will be removed and recalculated
|
||||
IEEE80211_RADIOTAP_F_WEP: frame will be encrypted if key available
|
||||
IEEE80211_RADIOTAP_F_FRAG: frame will be fragmented if longer than the
|
||||
current fragmentation threshold.
|
||||
|
||||
* IEEE80211_RADIOTAP_TX_FLAGS
|
||||
|
||||
IEEE80211_RADIOTAP_F_TX_NOACK: frame should be sent without waiting for
|
||||
an ACK even if it is a unicast frame
|
||||
|
||||
The injection code can also skip all other currently defined radiotap fields,
facilitating replay of captured radiotap headers directly.
|
||||
|
||||
Here is an example valid radiotap header defining some parameters
|
||||
|
||||
	0x00, 0x00,             // <-- radiotap version (plus one pad byte)
	0x0b, 0x00,             // <-- radiotap header length
	0x04, 0x0c, 0x00, 0x00, // <-- bitmap
	0x6c,                   // <-- rate
	0x0c,                   // <-- tx power
	0x01                    // <-- antenna
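
The bytes of the example header follow from the radiotap layout: a version byte, a pad byte, a little-endian 16-bit length, a little-endian 32-bit "present" bitmap, then the fields whose bits are set. The sketch below rebuilds the same 11 bytes. The DEMO_ names and the function are local to this example; real code would use the constants from the radiotap header files.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit numbers of the fields present in the example header (local names). */
#define DEMO_RT_RATE		2
#define DEMO_RT_DBM_TX_POWER	10
#define DEMO_RT_ANTENNA		11

#define DEMO_RT_HDR_LEN		11	/* 8-byte fixed part + 3 one-byte fields */

/* Write the header into buf; returns bytes written, or 0 if buf is too small. */
size_t demo_build_radiotap(uint8_t *buf, size_t buflen, uint8_t rate_500kbps,
			   int8_t tx_power_dbm, uint8_t antenna)
{
	uint32_t present = (1u << DEMO_RT_RATE) |
			   (1u << DEMO_RT_DBM_TX_POWER) |
			   (1u << DEMO_RT_ANTENNA);

	if (buflen < DEMO_RT_HDR_LEN)
		return 0;

	buf[0] = 0;				/* version */
	buf[1] = 0;				/* pad */
	buf[2] = DEMO_RT_HDR_LEN & 0xff;	/* length, little-endian */
	buf[3] = DEMO_RT_HDR_LEN >> 8;
	buf[4] = present & 0xff;		/* bitmap, little-endian */
	buf[5] = (present >> 8) & 0xff;
	buf[6] = (present >> 16) & 0xff;
	buf[7] = (present >> 24) & 0xff;
	buf[8] = rate_500kbps;			/* rate in units of 500 kb/s */
	buf[9] = (uint8_t)tx_power_dbm;
	buf[10] = antenna;
	return DEMO_RT_HDR_LEN;
}
```

Called with rate 0x6c (108 x 500 kb/s = 54 Mb/s), power 12 and antenna 1, it produces exactly the byte sequence listed above.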
|
||||
|
||||
The ieee80211 header follows immediately afterwards, looking for example like
|
||||
this:
|
||||
|
||||
0x08, 0x01, 0x00, 0x00,
|
||||
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
|
||||
0x13, 0x22, 0x33, 0x44, 0x55, 0x66,
|
||||
0x13, 0x22, 0x33, 0x44, 0x55, 0x66,
|
||||
0x10, 0x86
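
To check what such a header says, the frame control and sequence control fields can be decoded as sketched below. The struct and function are hypothetical helpers written for this example; only the 24-byte three-address header layout used above is assumed.

```c
#include <stdint.h>

/* Fields decoded from a 24-byte, three-address 802.11 header. */
struct demo_dot11 {
	unsigned int version;
	unsigned int type;	/* 0 management, 1 control, 2 data */
	unsigned int subtype;
	int to_ds;
	int from_ds;
	unsigned int fragment;	/* fragment number */
	unsigned int sequence;	/* sequence number */
};

void demo_decode_dot11(const uint8_t hdr[24], struct demo_dot11 *out)
{
	/* sequence control is the last two bytes, little-endian */
	uint16_t seq_ctrl = (uint16_t)(hdr[22] | (hdr[23] << 8));

	out->version  = hdr[0] & 0x03;
	out->type     = (hdr[0] >> 2) & 0x03;
	out->subtype  = (hdr[0] >> 4) & 0x0f;
	out->to_ds    = (hdr[1] & 0x01) != 0;
	out->from_ds  = (hdr[1] & 0x02) != 0;
	out->fragment = seq_ctrl & 0x0f;
	out->sequence = seq_ctrl >> 4;
}
```

Applied to the example bytes, this yields a data frame (type 2, subtype 0) with the To DS bit set, fragment number 0 and sequence number 0x861.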
|
||||
|
||||
Then lastly there is the payload.
|
||||
|
||||
After composing the packet contents, send it by calling send() on a socket
bound to a logical mac80211 interface that is in Monitor mode. Libpcap can
also be used (which is easier than doing the work to bind the socket to the
right interface), along the following lines:
|
||||
|
||||
ppcap = pcap_open_live(szInterfaceName, 800, 1, 20, szErrbuf);
|
||||
...
|
||||
r = pcap_inject(ppcap, u8aSendBuffer, nLength);
|
||||
|
||||
You can also find a link to a complete inject application here:
|
||||
|
||||
http://wireless.kernel.org/en/users/Documentation/packetspammer
|
||||
|
||||
Andy Green <andy@warmcat.com>
|
||||
68
Documentation/networking/mac80211_hwsim/README
Normal file
|
|
@ -0,0 +1,68 @@
|
|||
mac80211_hwsim - software simulator of 802.11 radio(s) for mac80211
|
||||
Copyright (c) 2008, Jouni Malinen <j@w1.fi>
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License version 2 as
|
||||
published by the Free Software Foundation.
|
||||
|
||||
|
||||
Introduction
|
||||
|
||||
mac80211_hwsim is a Linux kernel module that can be used to simulate
|
||||
an arbitrary number of IEEE 802.11 radios for mac80211. It can be used to
|
||||
test most of the mac80211 functionality and user space tools (e.g.,
|
||||
hostapd and wpa_supplicant) in a way that matches very closely with
|
||||
the normal case of using real WLAN hardware. From the mac80211 view
|
||||
point, mac80211_hwsim is yet another hardware driver, i.e., no changes
|
||||
to mac80211 are needed to use this testing tool.
|
||||
|
||||
The main goal for mac80211_hwsim is to make it easier for developers
|
||||
to test their code and work with new features to mac80211, hostapd,
|
||||
and wpa_supplicant. The simulated radios do not have the limitations
|
||||
of real hardware, so it is easy to generate an arbitrary test setup
|
||||
and always reproduce the same setup for future tests. In addition,
|
||||
since all radio operation is simulated, any channel can be used in
|
||||
tests regardless of regulatory rules.
|
||||
|
||||
mac80211_hwsim kernel module has a parameter 'radios' that can be used
|
||||
to select how many radios are simulated (default 2). This allows
|
||||
configuration of both very simple setups (e.g., just a single access
point and a station) and large-scale tests (multiple access points with
|
||||
hundreds of stations).
|
||||
|
||||
mac80211_hwsim works by tracking the current channel of each virtual
|
||||
radio and copying all transmitted frames to all other radios that are
|
||||
currently enabled and on the same channel as the transmitting
|
||||
radio. Software encryption in mac80211 is used so that the frames are
|
||||
actually encrypted over the virtual air interface to allow more
|
||||
complete testing of encryption.
|
||||
|
||||
A global monitoring netdev, hwsim#, is created independent of
|
||||
mac80211. This interface can be used to monitor all transmitted frames
|
||||
regardless of channel.
|
||||
|
||||
|
||||
Simple example
|
||||
|
||||
This example shows how to use mac80211_hwsim to simulate two radios:
|
||||
one to act as an access point and the other as a station that
|
||||
associates with the AP. hostapd and wpa_supplicant are used to take
|
||||
care of WPA2-PSK authentication. In addition, hostapd also handles the
access point side of association.
|
||||
|
||||
|
||||
# Build mac80211_hwsim as part of kernel configuration
|
||||
|
||||
# Load the module
|
||||
modprobe mac80211_hwsim
|
||||
|
||||
# Run hostapd (AP) for wlan0
|
||||
hostapd hostapd.conf
|
||||
|
||||
# Run wpa_supplicant (station) for wlan1
|
||||
wpa_supplicant -Dwext -iwlan1 -c wpa_supplicant.conf
|
||||
|
||||
|
||||
More test cases are available in hostap.git:
|
||||
git://w1.fi/srv/git/hostap.git and mac80211_hwsim/tests subdirectory
|
||||
(http://w1.fi/gitweb/gitweb.cgi?p=hostap.git;a=tree;f=mac80211_hwsim/tests)
|
||||
11
Documentation/networking/mac80211_hwsim/hostapd.conf
Normal file
|
|
@ -0,0 +1,11 @@
|
|||
interface=wlan0
|
||||
driver=nl80211
|
||||
|
||||
hw_mode=g
|
||||
channel=1
|
||||
ssid=mac80211 test
|
||||
|
||||
wpa=2
|
||||
wpa_key_mgmt=WPA-PSK
|
||||
wpa_pairwise=CCMP
|
||||
wpa_passphrase=12345678
|
||||
10
Documentation/networking/mac80211_hwsim/wpa_supplicant.conf
Normal file
|
|
@ -0,0 +1,10 @@
|
|||
ctrl_interface=/var/run/wpa_supplicant
|
||||
|
||||
network={
|
||||
ssid="mac80211 test"
|
||||
psk="12345678"
|
||||
key_mgmt=WPA-PSK
|
||||
proto=WPA2
|
||||
pairwise=CCMP
|
||||
group=CCMP
|
||||
}
|
||||
79
Documentation/networking/multiqueue.txt
Normal file
|
|
@ -0,0 +1,79 @@
|
|||
|
||||
HOWTO for multiqueue network device support
|
||||
===========================================
|
||||
|
||||
Section 1: Base driver requirements for implementing multiqueue support
|
||||
|
||||
Intro: Kernel support for multiqueue devices
|
||||
---------------------------------------------------------
|
||||
|
||||
Kernel support for multiqueue devices is always present.
|
||||
|
||||
Section 1: Base driver requirements for implementing multiqueue support
|
||||
-----------------------------------------------------------------------
|
||||
|
||||
Base drivers are required to use the new alloc_etherdev_mq() or
|
||||
alloc_netdev_mq() functions to allocate the subqueues for the device. The
|
||||
underlying kernel API will take care of the allocation and deallocation of
|
||||
the subqueue memory, as well as netdev configuration of where the queues
|
||||
exist in memory.
|
||||
|
||||
The base driver will also need to manage the queues as it does the global
|
||||
netdev->queue_lock today. Therefore base drivers should use the
|
||||
netif_{start|stop|wake}_subqueue() functions to manage each queue while the
|
||||
device is still operational. netdev->queue_lock is still used when the device
|
||||
comes online or when it's completely shut down (unregister_netdev(), etc.).
|
||||
|
||||
|
||||
Section 2: Qdisc support for multiqueue devices
-----------------------------------------------
|
||||
|
||||
Currently two qdiscs are optimized for multiqueue devices. The first is the
|
||||
default pfifo_fast qdisc. This qdisc supports one qdisc per hardware queue.
|
||||
A new round-robin qdisc, sch_multiq, also supports multiple hardware queues. The
|
||||
qdisc is responsible for classifying the skb's and then directing the skb's to
|
||||
bands and queues based on the value in skb->queue_mapping. Use this field in
|
||||
the base driver to determine which queue to send the skb to.
|
||||
|
||||
sch_multiq has been added for hardware that wishes to avoid head-of-line
blocking. It will cycle through the bands and verify that the hardware queue
associated with the band is not stopped prior to dequeuing a packet.
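
The round-robin behaviour described above can be modelled in a few lines of
Python. This is only an illustrative sketch, not the kernel implementation:
the class MultiqModel and its methods are invented for this example, and it
assumes one band per hardware queue.

```python
from collections import deque


class MultiqModel:
    """Toy model of a round-robin multiqueue qdisc (hypothetical, not kernel code)."""

    def __init__(self, num_queues):
        # One band per hardware queue: band i feeds queue i.
        self.bands = [deque() for _ in range(num_queues)]
        self.stopped = [False] * num_queues
        self.curband = 0  # band examined most recently

    def enqueue(self, packet, queue_mapping):
        # The band is chosen from the packet's queue_mapping value.
        self.bands[queue_mapping].append(packet)

    def dequeue(self):
        # Visit each band at most once, starting after the last band examined.
        for _ in range(len(self.bands)):
            self.curband = (self.curband + 1) % len(self.bands)
            band = self.curband
            # Skip bands whose hardware queue is stopped, so a stalled
            # queue cannot block traffic destined for the other queues.
            if self.stopped[band] or not self.bands[band]:
                continue
            return band, self.bands[band].popleft()
        return None  # nothing can be sent right now


q = MultiqModel(4)
q.enqueue("a", 0)
q.enqueue("b", 1)
q.enqueue("c", 1)
q.stopped[0] = True      # pretend hardware queue 0 is full
first = q.dequeue()      # band 0 is skipped, band 1 is served
second = q.dequeue()
third = q.dequeue()      # only band 0 holds data, and it is stopped
```

Packets queued for the stopped queue stay in their band until the queue is
woken again, while the other bands keep draining.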
|
||||
|
||||
On qdisc load, the number of bands is based on the number of queues on the
hardware. Once the association is made, any skb with skb->queue_mapping set
will be queued to the band associated with the hardware queue.
|
||||
|
||||
|
||||
Section 3: Brief howto using MULTIQ for multiqueue devices
|
||||
----------------------------------------------------------
|
||||
|
||||
The userspace command 'tc,' part of the iproute2 package, is used to configure
|
||||
qdiscs. To add the MULTIQ qdisc to your network device, assuming the device
|
||||
is called eth0, run the following command:
|
||||
|
||||
# tc qdisc add dev eth0 root handle 1: multiq
|
||||
|
||||
The qdisc will allocate one band for each queue that the device reports, and
then come online. Assuming eth0 has 4 Tx queues, the band mapping would look
like:
|
||||
|
||||
band 0 => queue 0
|
||||
band 1 => queue 1
|
||||
band 2 => queue 2
|
||||
band 3 => queue 3
|
||||
|
||||
Traffic will begin flowing through each queue based on either the simple_tx_hash
|
||||
function or based on netdev->select_queue() if you have it defined.
|
||||
|
||||
The behavior of tc filters remains the same. However, a new tc action,
skbedit, has been added. If you wanted to route all traffic to a specific
host, for example 192.168.0.3, through a specific queue, you could use this
action and establish a filter such as:
|
||||
|
||||
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
|
||||
match ip dst 192.168.0.3 \
|
||||
action skbedit queue_mapping 3
|
||||
|
||||
Author: Alexander Duyck <alexander.h.duyck@intel.com>
|
||||
Original Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>
|
||||
177
Documentation/networking/netconsole.txt
Normal file
|
|
@ -0,0 +1,177 @@
|
|||
|
||||
started by Ingo Molnar <mingo@redhat.com>, 2001.09.17
|
||||
2.6 port and netpoll api by Matt Mackall <mpm@selenic.com>, Sep 9 2003
|
||||
IPv6 support by Cong Wang <xiyou.wangcong@gmail.com>, Jan 1 2013
|
||||
|
||||
Please send bug reports to Matt Mackall <mpm@selenic.com>
|
||||
Satyam Sharma <satyam.sharma@gmail.com>, and Cong Wang <xiyou.wangcong@gmail.com>
|
||||
|
||||
Introduction:
|
||||
=============
|
||||
|
||||
This module logs kernel printk messages over UDP, allowing debugging of
problems where disk logging fails and serial consoles are impractical.
|
||||
|
||||
It can be used either built-in or as a module. As a built-in,
netconsole initializes immediately after the NICs and will bring up
the specified interface as soon as possible. While this doesn't allow
capture of early kernel panics, it does capture most of the boot
process.
|
||||
|
||||
Sender and receiver configuration:
|
||||
==================================
|
||||
|
||||
It takes a string configuration parameter "netconsole" in the
|
||||
following format:
|
||||
|
||||
netconsole=[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]
|
||||
|
||||
where
|
||||
src-port      source for UDP packets (defaults to 6665)
src-ip        source IP to use (interface address)
dev           network interface (eth0)
tgt-port      port for logging agent (6666)
tgt-ip        IP address for logging agent
tgt-macaddr   ethernet MAC address for logging agent (broadcast)
|
||||
|
||||
Examples:
|
||||
|
||||
linux netconsole=4444@10.0.0.1/eth1,9353@10.0.0.2/12:34:56:78:9a:bc
|
||||
|
||||
or
|
||||
|
||||
insmod netconsole netconsole=@/,@10.0.0.2/
|
||||
|
||||
or using IPv6
|
||||
|
||||
insmod netconsole netconsole=@/,@fd00:1:2:3::1/
|
||||
|
||||
It also supports logging to multiple remote agents by specifying
|
||||
parameters for the multiple agents separated by semicolons and the
|
||||
complete string enclosed in "quotes", thusly:
|
||||
|
||||
modprobe netconsole netconsole="@/,@10.0.0.2/;@/eth1,6892@10.0.0.3/"
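
To make the parameter format and its defaults concrete, here is a small
Python sketch that splits a netconsole string into its fields. It is a
simplified model written for this document, not the kernel's parser: the
function names are invented, and the defaults are the ones listed above,
with ff:ff:ff:ff:ff:ff standing in for the broadcast MAC address.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def parse_target(spec):
    """Parse one '[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]' string."""
    local, remote = spec.split(",", 1)

    # Local half: everything before '@' is the port, then IP, then device.
    src_port, _, rest = local.partition("@")
    src_ip, _, dev = rest.partition("/")

    # Remote half: port, mandatory IP, optional MAC address.
    tgt_port, _, rest = remote.partition("@")
    tgt_ip, _, tgt_mac = rest.partition("/")
    if not tgt_ip:
        raise ValueError("tgt-ip is mandatory")

    return {
        "src_port": int(src_port) if src_port else 6665,
        "src_ip": src_ip or None,  # None: use the interface address
        "dev": dev or "eth0",
        "tgt_port": int(tgt_port) if tgt_port else 6666,
        "tgt_ip": tgt_ip,
        "tgt_mac": tgt_mac or BROADCAST_MAC,
    }


def parse_netconsole(value):
    """Split a multi-target value on semicolons and parse each target."""
    return [parse_target(part) for part in value.split(";") if part]


targets = parse_netconsole("@/,@10.0.0.2/;@/eth1,6892@10.0.0.3/")
full = parse_target("4444@10.0.0.1/eth1,9353@10.0.0.2/12:34:56:78:9a:bc")
```

Here targets[0] falls back to the defaults for everything except the target
IP, while targets[1] overrides the device and the target port.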
|
||||
|
||||
Built-in netconsole starts immediately after the TCP stack is
|
||||
initialized and attempts to bring up the supplied dev at the supplied
|
||||
address.
|
||||
|
||||
The remote host has several options to receive the kernel messages,
|
||||
for example:
|
||||
|
||||
1) syslogd
|
||||
|
||||
2) netcat
|
||||
|
||||
On distributions using a BSD-based netcat version (e.g. Fedora,
openSUSE and Ubuntu) the listening port must be specified without
the -p switch. In each pair below, the first form is for traditional
netcat and the second for BSD-based netcat:

'nc -u -l -p <port>' / 'nc -u -l <port>' or
'netcat -u -l -p <port>' / 'netcat -u -l <port>'
|
||||
|
||||
3) socat
|
||||
|
||||
'socat udp-recv:<port> -'
|
||||
|
||||
Dynamic reconfiguration:
|
||||
========================
|
||||
|
||||
Dynamic reconfigurability is a useful addition to netconsole that enables
|
||||
remote logging targets to be dynamically added, removed, or have their
|
||||
parameters reconfigured at runtime from a configfs-based userspace interface.
|
||||
[ Note that the parameters of netconsole targets that were specified/created
|
||||
from the boot/module option are not exposed via this interface, and hence
|
||||
cannot be modified dynamically. ]
|
||||
|
||||
To include this feature, select CONFIG_NETCONSOLE_DYNAMIC when building the
|
||||
netconsole module (or kernel, if netconsole is built-in).
|
||||
|
||||
Some examples follow (where configfs is mounted at the /sys/kernel/config
|
||||
mountpoint).
|
||||
|
||||
To add a remote logging target (target names can be arbitrary):
|
||||
|
||||
cd /sys/kernel/config/netconsole/
|
||||
mkdir target1
|
||||
|
||||
Note that newly created targets have default parameter values (as mentioned
|
||||
above) and are disabled by default -- they must first be enabled by writing
|
||||
"1" to the "enabled" attribute (usually after setting parameters accordingly)
|
||||
as described below.
|
||||
|
||||
To remove a target:
|
||||
|
||||
rmdir /sys/kernel/config/netconsole/othertarget/
|
||||
|
||||
The interface exposes these parameters of a netconsole target to userspace:
|
||||
|
||||
enabled       Is this target currently enabled?   (read-write)
dev_name      Local network interface name        (read-write)
local_port    Source UDP port to use              (read-write)
remote_port   Remote agent's UDP port             (read-write)
local_ip      Source IP address to use            (read-write)
remote_ip     Remote agent's IP address           (read-write)
local_mac     Local interface's MAC address       (read-only)
remote_mac    Remote agent's MAC address          (read-write)
|
||||
|
||||
The "enabled" attribute is also used to control whether the parameters of
|
||||
a target can be updated or not -- you can modify the parameters of only
|
||||
disabled targets (i.e. if "enabled" is 0).
|
||||
|
||||
To update a target's parameters:
|
||||
|
||||
cat enabled                              # check if enabled is 1
echo 0 > enabled                         # disable the target (if required)
echo eth2 > dev_name                     # set local interface
echo 10.0.0.4 > remote_ip                # update some parameter
echo cb:a9:87:65:43:21 > remote_mac      # update more parameters
echo 1 > enabled                         # enable target again
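
The same disable, update, enable sequence can be scripted. The Python sketch
below is a hypothetical helper (update_target is not part of any existing
tool); it assumes the target directory already exists and, for
demonstration, writes to a scratch directory rather than to real configfs.

```python
import os
import tempfile


def update_target(target_dir, params):
    """Disable a target, write new parameters, then enable it again."""
    def write_attr(name, value):
        with open(os.path.join(target_dir, name), "w") as f:
            f.write(f"{value}\n")

    write_attr("enabled", 0)      # parameters are writable only while disabled
    for name, value in params.items():
        write_attr(name, value)
    write_attr("enabled", 1)      # re-enable with the new settings


# Demonstration against a scratch directory instead of real configfs.
demo_dir = tempfile.mkdtemp()
update_target(demo_dir, {"dev_name": "eth2", "remote_ip": "10.0.0.4"})
```

On a real system the first argument would be a target directory such as
/sys/kernel/config/netconsole/target1, and the script would need root
privileges.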
|
||||
|
||||
You can also update the local interface dynamically. This is especially
|
||||
useful if you want to use interfaces that have newly come up (and may not
|
||||
have existed when netconsole was loaded / initialized).
|
||||
|
||||
Miscellaneous notes:
|
||||
====================
|
||||
|
||||
WARNING: the default target ethernet setting uses the broadcast
|
||||
ethernet address to send packets, which can cause increased load on
|
||||
other systems on the same ethernet segment.
|
||||
|
||||
TIP: some LAN switches may be configured to suppress ethernet broadcasts
|
||||
so it is advised to explicitly specify the remote agents' MAC addresses
|
||||
from the config parameters passed to netconsole.
|
||||
|
||||
TIP: to find out the MAC address of, say, 10.0.0.2, you may try using:
|
||||
|
||||
ping -c 1 10.0.0.2 ; /sbin/arp -n | grep 10.0.0.2
|
||||
|
||||
TIP: in case the remote logging agent is on a different LAN subnet from
the sender, it is suggested to try specifying the MAC address of the
default gateway (you may use /sbin/route -n to find it out) as the
remote MAC address instead.
|
||||
|
||||
NOTE: the network device (eth1 in the above case) can run any kind
of other network traffic; netconsole is not intrusive. Netconsole
might cause slight delays in other traffic if the volume of kernel
messages is high, but should have no other impact.
|
||||
|
||||
NOTE: if you find that the remote logging agent is not receiving or
|
||||
printing all messages from the sender, it is likely that you have set
|
||||
the "console_loglevel" parameter (on the sender) to only send high
|
||||
priority messages to the console. You can change this at runtime using:
|
||||
|
||||
dmesg -n 8
|
||||
|
||||
or by specifying "debug" on the kernel command line at boot, to send
|
||||
all kernel messages to the console. A specific value for this parameter
|
||||
can also be set using the "loglevel" kernel boot option. See the
|
||||
dmesg(8) man page and Documentation/kernel-parameters.txt for details.
|
||||
|
||||
Netconsole was designed to be as instantaneous as possible, to
|
||||
enable the logging of even the most critical kernel bugs. It works
|
||||
from IRQ contexts as well, and does not enable interrupts while
|
||||
sending packets. Due to these unique needs, configuration cannot
|
||||
be more automatic, and some fundamental limitations will remain:
|
||||
only IP networks, UDP packets and ethernet devices are supported.
|
||||
224
Documentation/networking/netdev-FAQ.txt
Normal file
|
|
@ -0,0 +1,224 @@
|
|||
|
||||
Information you need to know about netdev
|
||||
-----------------------------------------
|
||||
|
||||
Q: What is netdev?
|
||||
|
||||
A: It is a mailing list for all network-related Linux stuff. This includes
anything found under net/ (e.g. core code like IPv6) and drivers/net
(e.g. hardware specific drivers) in the Linux source tree.
|
||||
|
||||
Note that some subsystems (e.g. wireless drivers) which have a high volume
|
||||
of traffic have their own specific mailing lists.
|
||||
|
||||
The netdev list is managed (like many other Linux mailing lists) through
|
||||
VGER ( http://vger.kernel.org/ ) and archives can be found below:
|
||||
|
||||
http://marc.info/?l=linux-netdev
|
||||
http://www.spinics.net/lists/netdev/
|
||||
|
||||
Aside from subsystems like those mentioned above, all network-related Linux
development (e.g. RFCs, review, comments) takes place on netdev.
|
||||
|
||||
Q: How do the changes posted to netdev make their way into Linux?
|
||||
|
||||
A: There are always two trees (git repositories) in play. Both are driven
|
||||
by David Miller, the main network maintainer. There is the "net" tree,
|
||||
and the "net-next" tree. As you can probably guess from the names, the
|
||||
net tree is for fixes to existing code already in the mainline tree from
|
||||
Linus, and net-next is where the new code goes for the future release.
|
||||
You can find the trees here:
|
||||
|
||||
http://git.kernel.org/?p=linux/kernel/git/davem/net.git
|
||||
http://git.kernel.org/?p=linux/kernel/git/davem/net-next.git
|
||||
|
||||
Q: How often do changes from these trees make it to the mainline Linus tree?
|
||||
|
||||
A: To understand this, you need to know a bit of background information
|
||||
on the cadence of Linux development. Each new release starts off with
|
||||
a two week "merge window" where the main maintainers feed their new
|
||||
stuff to Linus for merging into the mainline tree. After the two weeks,
|
||||
the merge window is closed, and it is called/tagged "-rc1". No new
|
||||
features get mainlined after this -- only fixes to the rc1 content
|
||||
are expected. After roughly a week of collecting fixes to the rc1
|
||||
content, rc2 is released. This repeats on a roughly weekly basis
|
||||
until rc7 (typically; sometimes rc6 if things are quiet, or rc8 if
|
||||
things are in a state of churn), and a week after the last vX.Y-rcN
|
||||
was done, the official "vX.Y" is released.
|
||||
|
||||
Relating that to netdev: At the beginning of the 2-week merge window,
|
||||
the net-next tree will be closed - no new changes/features. The
|
||||
accumulated new content of the past ~10 weeks will be passed onto
|
||||
mainline/Linus via a pull request for vX.Y -- at the same time,
|
||||
the "net" tree will start accumulating fixes for this pulled content
|
||||
relating to vX.Y
|
||||
|
||||
An announcement indicating when net-next has been closed is usually
|
||||
sent to netdev, but knowing the above, you can predict that in advance.
|
||||
|
||||
IMPORTANT: Do not send new net-next content to netdev while the
net-next tree is closed.
|
||||
|
||||
Shortly after the two weeks have passed (and vX.Y-rc1 is released), the
|
||||
tree for net-next reopens to collect content for the next (vX.Y+1) release.
|
||||
|
||||
If you aren't subscribed to netdev and/or are simply unsure if net-next
|
||||
has re-opened yet, simply check the net-next git repository link above for
|
||||
any new networking-related commits.
|
||||
|
||||
The "net" tree continues to collect fixes for the vX.Y content, and
is fed back to Linus at regular (~weekly) intervals. This means that the
focus for "net" is on stabilization and bugfixes.
|
||||
|
||||
Finally, the vX.Y gets released, and the whole cycle starts over.
|
||||
|
||||
Q: So where are we now in this cycle?
|
||||
|
||||
A: Load the mainline (Linus) page here:
|
||||
|
||||
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git
|
||||
|
||||
and note the top of the "tags" section. If it is rc1, it is early
|
||||
in the dev cycle. If it was tagged rc7 a week ago, then a release
|
||||
is probably imminent.
|
||||
|
||||
Q: How do I indicate which tree (net vs. net-next) my patch should be in?
|
||||
|
||||
A: Firstly, think whether you have a bug fix or new "next-like" content.
|
||||
Then once decided, assuming that you use git, use the prefix flag, i.e.
|
||||
|
||||
git format-patch --subject-prefix='PATCH net-next' start..finish
|
||||
|
||||
Use "net" instead of "net-next" (always lower case) in the above for
|
||||
bug-fix net content. If you don't use git, then note the only magic in
|
||||
the above is just the subject text of the outgoing e-mail, and you can
|
||||
manually change it yourself with whatever MUA you are comfortable with.
|
||||
|
||||
Q: I sent a patch and I'm wondering what happened to it. How can I tell
|
||||
whether it got merged?
|
||||
|
||||
A: Start by looking at the main patchwork queue for netdev:
|
||||
|
||||
http://patchwork.ozlabs.org/project/netdev/list/
|
||||
|
||||
The "State" field will tell you exactly where things are at with
|
||||
your patch.
|
||||
|
||||
Q: The above only says "Under Review". How can I find out more?
|
||||
|
||||
A: Generally speaking, the patches get triaged quickly (in less than 48h).
|
||||
So be patient. Asking the maintainer for status updates on your
|
||||
patch is a good way to ensure your patch is ignored or pushed to
|
||||
the bottom of the priority list.
|
||||
|
||||
Q: How can I tell what patches are queued up for backporting to the
|
||||
various stable releases?
|
||||
|
||||
A: Normally Greg Kroah-Hartman collects stable commits himself, but
|
||||
for networking, Dave collects up patches he deems critical for the
|
||||
networking subsystem, and then hands them off to Greg.
|
||||
|
||||
There is a patchwork queue that you can see here:
|
||||
http://patchwork.ozlabs.org/bundle/davem/stable/?state=*
|
||||
|
||||
It contains the patches which Dave has selected, but not yet handed
|
||||
off to Greg. If Greg already has the patch, then it will be here:
|
||||
http://git.kernel.org/cgit/linux/kernel/git/stable/stable-queue.git
|
||||
|
||||
A quick way to find whether the patch is in this stable-queue is
|
||||
to simply clone the repo, and then git grep the mainline commit ID, e.g.
|
||||
|
||||
stable/stable-queue$ git grep -l 284041ef21fdf2e
|
||||
releases/3.0.84/ipv6-fix-possible-crashes-in-ip6_cork_release.patch
|
||||
releases/3.4.51/ipv6-fix-possible-crashes-in-ip6_cork_release.patch
|
||||
releases/3.9.8/ipv6-fix-possible-crashes-in-ip6_cork_release.patch
|
||||
stable/stable-queue$
|
||||
|
||||
Q: I see a network patch and I think it should be backported to stable.
|
||||
Should I request it via "stable@vger.kernel.org" like the references in
|
||||
the kernel's Documentation/stable_kernel_rules.txt file say?
|
||||
|
||||
A: No, not for networking. First check the stable queues as described above
to see if it is already queued. If not, then send a mail to netdev, listing
the upstream commit ID and why you think it should be a stable candidate.
|
||||
|
||||
Before you rush to do the above, do note that the normal stable rules
|
||||
in Documentation/stable_kernel_rules.txt still apply. So you need to
|
||||
explicitly indicate why it is a critical fix and exactly what users are
|
||||
impacted. In addition, you need to convince yourself that you _really_
|
||||
think it has been overlooked, vs. having been considered and rejected.
|
||||
|
||||
Generally speaking, the longer it has had a chance to "soak" in mainline,
|
||||
the better the odds that it is an OK candidate for stable. So scrambling
|
||||
to request a commit be added the day after it appears should be avoided.
|
||||
|
||||
Q: I have created a network patch and I think it should be backported to
|
||||
stable. Should I add a "Cc: stable@vger.kernel.org" like the references
|
||||
in the kernel's Documentation/ directory say?
|
||||
|
||||
A: No. See above answer. In short, if you think it really belongs in
|
||||
stable, then ensure you write a decent commit log that describes who
|
||||
gets impacted by the bugfix and how it manifests itself, and when the
|
||||
bug was introduced. If you do that properly, then the commit will
|
||||
get handled appropriately and most likely get put in the patchwork
|
||||
stable queue if it really warrants it.
|
||||
|
||||
If you think there is some valid information relating to it being in
|
||||
stable that does _not_ belong in the commit log, then use the three
|
||||
dash marker line as described in Documentation/SubmittingPatches to
|
||||
temporarily embed that information into the patch that you send.
|
||||
|
||||
Q: Someone said that the comment style and coding convention is different
|
||||
for the networking content. Is this true?
|
||||
|
||||
A: Yes, in a largely trivial way. Instead of this:
|
||||
|
||||
/*
 * foobar blah blah blah
 * another line of text
 */
|
||||
|
||||
it is requested that you make it look like this:
|
||||
|
||||
/* foobar blah blah blah
 * another line of text
 */
|
||||
|
||||
Q: I am working in existing code that has the former comment style and not the
|
||||
latter. Should I submit new code in the former style or the latter?
|
||||
|
||||
A: Make it the latter style, so that eventually all code in the domain of
|
||||
netdev is of this format.
|
||||
|
||||
Q: I found a bug that might have possible security implications or similar.
|
||||
Should I mail the main netdev maintainer off-list?
|
||||
|
||||
A: No. The current netdev maintainer has consistently requested that people
|
||||
use the mailing lists and not reach out directly. If you aren't OK with
|
||||
that, then perhaps consider mailing "security@kernel.org" or reading about
|
||||
http://oss-security.openwall.org/wiki/mailing-lists/distros
|
||||
as possible alternative mechanisms.
|
||||
|
||||
Q: What level of testing is expected before I submit my change?
|
||||
|
||||
A: If your changes are against net-next, the expectation is that you
|
||||
have tested by layering your changes on top of net-next. Ideally you
|
||||
will have done run-time testing specific to your change, but at a
|
||||
minimum, your changes should survive an "allyesconfig" and an
|
||||
"allmodconfig" build without new warnings or failures.
|
||||
|
||||
Q: Any other tips to help ensure my net/net-next patch gets OK'd?
|
||||
|
||||
A: Attention to detail. Re-read your own work as if you were the
|
||||
reviewer. You can start with using checkpatch.pl, perhaps even
|
||||
with the "--strict" flag. But do not be mindlessly robotic in
|
||||
doing so. If your change is a bug-fix, make sure your commit log
|
||||
indicates the end-user visible symptom, the underlying reason as
|
||||
to why it happens, and then if necessary, explain why the fix proposed
|
||||
is the best way to get things done. Don't mangle whitespace, and as
|
||||
is common, don't mis-indent function arguments that span multiple lines.
|
||||
If it is your first patch, mail it to yourself so you can test apply
|
||||
it to an unpatched tree to confirm infrastructure didn't mangle it.
|
||||
|
||||
Finally, go back and read Documentation/SubmittingPatches to be
|
||||
sure you are not repeating some common mistake documented there.
|
||||
167
Documentation/networking/netdev-features.txt
Normal file
|
|
@ -0,0 +1,167 @@
|
|||
Netdev features mess and how to get out from it alive
|
||||
=====================================================
|
||||
|
||||
Author:
|
||||
Michał Mirosław <mirq-linux@rere.qmqm.pl>
|
||||
|
||||
|
||||
|
||||
Part I: Feature sets
|
||||
======================
|
||||
|
||||
Long gone are the days when a network card would just take and give packets
|
||||
verbatim. Today's devices add multiple features and bugs (read: offloads)
|
||||
that relieve an OS of various tasks like generating and checking checksums,
splitting packets, and classifying them. Those capabilities and their state
are commonly referred to as netdev features in the Linux kernel world.
|
||||
|
||||
There are currently three sets of features relevant to the driver, and
|
||||
one used internally by network core:
|
||||
|
||||
1. netdev->hw_features set contains features whose state may possibly
|
||||
be changed (enabled or disabled) for a particular device by user's
|
||||
request. This set should be initialized in ndo_init callback and not
|
||||
changed later.
|
||||
|
||||
2. netdev->features set contains features which are currently enabled
|
||||
for a device. This should be changed only by network core or in
|
||||
error paths of ndo_set_features callback.
|
||||
|
||||
3. netdev->vlan_features set contains features whose state is inherited
|
||||
by child VLAN devices (limits netdev->features set). This is currently
|
||||
used for all VLAN devices whether tags are stripped or inserted in
|
||||
hardware or software.
|
||||
|
||||
4. netdev->wanted_features set contains feature set requested by user.
|
||||
This set is filtered by ndo_fix_features callback whenever it or
|
||||
some device-specific conditions change. This set is internal to
|
||||
networking core and should not be referenced in drivers.
|
||||
|
||||
|
||||
|
||||
Part II: Controlling enabled features
|
||||
=======================================
|
||||
|
||||
When current feature set (netdev->features) is to be changed, new set
|
||||
is calculated and filtered by calling ndo_fix_features callback
|
||||
and netdev_fix_features(). If the resulting set differs from current
|
||||
set, it is passed to ndo_set_features callback and (if the callback
|
||||
returns success) replaces value stored in netdev->features.
|
||||
NETDEV_FEAT_CHANGE notification is issued after that whenever current
|
||||
set might have changed.
|
||||
|
||||
The following events trigger recalculation:
|
||||
1. device's registration, after ndo_init returned success
|
||||
2. user requested changes in features state
|
||||
3. netdev_update_features() is called
|
||||
|
||||
ndo_*_features callbacks are called with rtnl_lock held. Missing callbacks
|
||||
are treated as always returning success.
|
||||
|
||||
A driver that wants to trigger recalculation must do so by calling
|
||||
netdev_update_features() while holding rtnl_lock. This should not be done
|
||||
from ndo_*_features callbacks. netdev->features should not be modified by
|
||||
driver except by means of ndo_fix_features callback.
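
The recalculation flow can be sketched in Python as a toy model. Everything
here is illustrative: the Dev class, the bit values and the TSO dependency
rule are assumptions made for this example, and the core-imposed
netdev_fix_features() step is left out for brevity.

```python
# Hypothetical feature bits; the real ones live in include/linux/netdev_features.h.
F_SG, F_IP_CSUM, F_TSO, F_RXCSUM = 1, 2, 4, 8


class Dev:
    """Toy stand-in for struct net_device (illustrative only)."""

    def __init__(self, hw_features, features):
        self.hw_features = hw_features    # features the user may toggle
        self.features = features          # features currently enabled
        self.wanted_features = features & hw_features

    def fix_features(self, features):
        # Stateless dependency resolution: in this model TSO needs SG and
        # checksumming, so drop TSO rather than forcing its dependencies on.
        if features & F_TSO and not (features & F_SG and features & F_IP_CSUM):
            features &= ~F_TSO
        return features

    def set_features(self, features):
        return 0  # 0 means the (imaginary) hardware accepted the set


def update_features(dev):
    """Recalculate dev.features; return True if the enabled set changed."""
    # Non-toggleable bits keep their state; toggleable ones follow the user.
    new = (dev.features & ~dev.hw_features) | dev.wanted_features
    new = dev.fix_features(new)
    if new == dev.features:
        return False
    if dev.set_features(new) != 0:
        return False
    dev.features = new
    return True


dev = Dev(hw_features=F_SG | F_IP_CSUM | F_TSO,
          features=F_SG | F_IP_CSUM | F_TSO | F_RXCSUM)
dev.wanted_features &= ~F_SG    # the user turns scatter-gather off
changed = update_features(dev)
```

Turning off scatter-gather in this model also drops TSO, because the fix
step disables a feature whose dependencies are no longer met.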
|
||||
|
||||
|
||||
|
||||
Part III: Implementation hints
|
||||
================================
|
||||
|
||||
* ndo_fix_features:
|
||||
|
||||
All dependencies between features should be resolved here. The resulting
|
||||
set can be reduced further by networking core imposed limitations (as coded
|
||||
in netdev_fix_features()). For this reason it is safer to disable a feature
|
||||
when its dependencies are not met instead of forcing the dependency on.
|
||||
|
||||
This callback should not modify hardware nor driver state (should be
|
||||
stateless). It can be called multiple times between successive
|
||||
ndo_set_features calls.
|
||||
|
||||
Callback must not alter features contained in NETIF_F_SOFT_FEATURES or
|
||||
NETIF_F_NEVER_CHANGE sets. The exception is NETIF_F_VLAN_CHALLENGED but
|
||||
care must be taken as the change won't affect already configured VLANs.
|
||||
|
||||
* ndo_set_features:
|
||||
|
||||
Hardware should be reconfigured to match passed feature set. The set
|
||||
should not be altered unless some error condition happens that can't
|
||||
be reliably detected in ndo_fix_features. In this case, the callback
|
||||
should update netdev->features to match resulting hardware state.
|
||||
Errors returned are not (and cannot be) propagated anywhere except dmesg.
|
||||
(Note: successful return is zero, >0 means silent error.)
|
||||
|
||||
|
||||
|
||||
Part IV: Features
|
||||
===================
|
||||
|
||||
For current list of features, see include/linux/netdev_features.h.
|
||||
This section describes semantics of some of them.
|
||||
|
||||
* Transmit checksumming
|
||||
|
||||
For complete description, see comments near the top of include/linux/skbuff.h.
|
||||
|
||||
Note: NETIF_F_HW_CSUM is a superset of NETIF_F_IP_CSUM + NETIF_F_IPV6_CSUM.
|
||||
It means that device can fill TCP/UDP-like checksum anywhere in the packets
|
||||
whatever headers there might be.
|
||||
|
||||
* Transmit TCP segmentation offload
|
||||
|
||||
NETIF_F_TSO_ECN means that hardware can properly split packets with CWR bit
|
||||
set, be it TCPv4 (when NETIF_F_TSO is enabled) or TCPv6 (NETIF_F_TSO6).
|
||||
|
||||
* Transmit DMA from high memory
|
||||
|
||||
On platforms where this is relevant, NETIF_F_HIGHDMA signals that
|
||||
ndo_start_xmit can handle skbs with frags in high memory.
|
||||
|
||||
* Transmit scatter-gather
|
||||
|
||||
Those features say that ndo_start_xmit can handle fragmented skbs:
|
||||
NETIF_F_SG --- paged skbs (skb_shinfo()->frags), NETIF_F_FRAGLIST ---
|
||||
chained skbs (skb->next/prev list).
|
||||
|
||||
* Software features
|
||||
|
||||
Features contained in NETIF_F_SOFT_FEATURES are features of networking
|
||||
stack. Driver should not change behaviour based on them.
|
||||
|
||||
* LLTX driver (deprecated for hardware drivers)
|
||||
|
||||
NETIF_F_LLTX should be set in drivers that implement their own locking in
|
||||
transmit path or don't need locking at all (e.g. software tunnels).
|
||||
In ndo_start_xmit, it is recommended to use a try_lock and return
NETDEV_TX_LOCKED when the spin lock fails. The locking should also properly
protect against other callbacks (you will need to work out the applicable
rules yourself).
|
||||
|
||||
Don't use it for new drivers.
|
||||
|
||||
* netns-local device
|
||||
|
||||
NETIF_F_NETNS_LOCAL is set for devices that are not allowed to move between
|
||||
network namespaces (e.g. loopback).
|
||||
|
||||
Don't use it in drivers.
|
||||
|
||||
* VLAN challenged
|
||||
|
||||
NETIF_F_VLAN_CHALLENGED should be set for devices which can't cope with VLAN
|
||||
headers. Some drivers set this because the cards can't handle the bigger MTU.
|
||||
[FIXME: Those cases could be fixed in VLAN code by allowing only reduced-MTU
VLANs. This may not be useful, though.]
|
||||
|
||||
* rx-fcs
|
||||
|
||||
This requests that the NIC append the Ethernet Frame Checksum (FCS)
|
||||
to the end of the skb data. This allows sniffers and other tools to
|
||||
read the CRC recorded by the NIC on receipt of the packet.
|
||||
|
||||
* rx-all
|
||||
|
||||
This requests that the NIC receive all possible frames, including errored
|
||||
frames (such as bad FCS, etc). This can be helpful when sniffing a link with
|
||||
bad packets on it. Some NICs may receive more packets if also put into normal
|
||||
PROMISC mode.
|
||||
107
Documentation/networking/netdevices.txt
Normal file
|
|
@ -0,0 +1,107 @@
|
|||
|
||||
Network Devices, the Kernel, and You!
|
||||
|
||||
|
||||
Introduction
|
||||
============
|
||||
The following is a random collection of documentation regarding
|
||||
network devices.
|
||||
|
||||
struct net_device allocation rules
|
||||
==================================
|
||||
Network device structures need to persist even after the module is unloaded
and must be allocated with alloc_netdev_mqs() and friends.
If the device has registered successfully, it will be freed on last use
by free_netdev(). This is required to handle the pathological case cleanly
(example: rmmod mydriver </sys/class/net/myeth/mtu )
|
||||
|
||||
alloc_netdev_mqs()/alloc_netdev() reserve extra space for driver
|
||||
private data which gets freed when the network device is freed. If
|
||||
separately allocated data is attached to the network device
|
||||
(netdev_priv(dev)) then it is up to the module exit handler to free that.
|
||||
|
||||
MTU
|
||||
===
|
||||
Each network device has a Maximum Transmission Unit (MTU). Upper layer
protocols must not pass a socket buffer (skb) to a device to transmit with
more data than the MTU. The MTU does not include link layer header overhead,
so for example on Ethernet, if the standard MTU of 1500 bytes is used, the
actual skb will contain up to 1514 bytes because of the Ethernet
header. Devices should allow for the 4 byte VLAN header as well.
|
||||
|
||||
Segmentation Offload (GSO, TSO) is an exception to this rule. The
|
||||
upper layer protocol may pass a large socket buffer to the device
|
||||
transmit routine, and the device will break that up into separate
|
||||
packets based on the current MTU.
|
||||
|
||||
MTU is symmetrical and applies both to receive and transmit. A device
must be able to receive at least the maximum size packet allowed by
the MTU. A network device may use the MTU as a mechanism to size receive
buffers, but the device should allow packets with a VLAN header. With the
standard Ethernet MTU of 1500 bytes, the device should allow up to
1518 byte packets (1500 + 14 header + 4 tag). The device may drop,
truncate, or pass up oversize packets, but dropping oversize
packets is preferred.
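
The frame size arithmetic above can be written out as a short Python
sketch. The helper max_rx_frame is invented for this example; the constants
mirror the 14 byte Ethernet header and 4 byte VLAN tag mentioned in the
text, and the FCS is not counted.

```python
ETH_HLEN = 14    # destination MAC + source MAC + EtherType
VLAN_HLEN = 4    # one 802.1Q tag


def max_rx_frame(mtu, vlan=True):
    """Largest frame (without FCS) the device should accept for a given MTU."""
    return mtu + ETH_HLEN + (VLAN_HLEN if vlan else 0)


untagged = max_rx_frame(1500, vlan=False)   # 1500 + 14
tagged = max_rx_frame(1500)                 # 1500 + 14 + 4
```

A driver sizing its receive buffers from the MTU would use the tagged value
so that VLAN traffic is not dropped as oversize.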
|
||||
|
||||
|
||||
struct net_device synchronization rules
|
||||
=======================================
|
||||
ndo_open:
|
||||
Synchronization: rtnl_lock() semaphore.
|
||||
Context: process
|
||||
|
||||
ndo_stop:
|
||||
Synchronization: rtnl_lock() semaphore.
|
||||
Context: process
|
||||
Note: netif_running() is guaranteed false
|
||||
|
||||
ndo_do_ioctl:
|
||||
Synchronization: rtnl_lock() semaphore.
|
||||
Context: process
|
||||
|
||||
ndo_get_stats:
|
||||
Synchronization: dev_base_lock rwlock.
|
||||
Context: nominally process, but don't sleep inside an rwlock
|
||||
|
||||
ndo_start_xmit:
|
||||
Synchronization: __netif_tx_lock spinlock.
|
||||
|
||||
When the driver sets NETIF_F_LLTX in dev->features this will be
|
||||
called without holding netif_tx_lock. In this case the driver
|
||||
has to lock by itself when needed. It is recommended to use a try lock
|
||||
for this and return NETDEV_TX_LOCKED when the spin lock fails.
|
||||
The locking there should also properly protect against
|
||||
set_rx_mode. Note that the use of NETIF_F_LLTX is deprecated.
|
||||
Don't use it for new drivers.
|
||||
|
||||
Context: Process with BHs disabled or BH (timer),
|
||||
will be called with interrupts disabled by netconsole.
|
||||
|
||||
Return codes:
|
||||
o NETDEV_TX_OK everything ok.
|
||||
o NETDEV_TX_BUSY Cannot transmit packet, try later
|
||||
Usually a bug, means queue start/stop flow control is broken in
|
||||
the driver. Note: the driver must NOT put the skb in its DMA ring.
|
||||
o NETDEV_TX_LOCKED Locking failed, please retry quickly.
|
||||
Only valid when NETIF_F_LLTX is set.
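
The remark that NETDEV_TX_BUSY usually indicates broken queue start/stop flow control can be illustrated with a small user-space model. Everything here is hypothetical (struct ex_txq, ex_start_xmit(), ex_tx_complete()); it is not the kernel API, only a sketch of the idea that a driver stops its queue as soon as the ring fills, so the busy path is never reached in normal operation.

```c
#include <stdbool.h>

enum ex_tx_result { EX_TX_OK, EX_TX_BUSY };

/* Toy model of one TX queue (all names are illustrative). */
struct ex_txq {
	unsigned int ring_size;	/* descriptors in the TX ring */
	unsigned int in_flight;	/* descriptors currently handed to hardware */
	bool stopped;		/* models the "queue stopped" state */
};

/* Transmit path: queue the packet, then stop the queue if the ring is full. */
static enum ex_tx_result ex_start_xmit(struct ex_txq *q)
{
	if (q->in_flight == q->ring_size)
		return EX_TX_BUSY;	/* only reachable if "stopped" was ignored */

	q->in_flight++;
	if (q->in_flight == q->ring_size)
		q->stopped = true;	/* stop before another packet can arrive */
	return EX_TX_OK;
}

/* Completion path: reclaim one descriptor and restart the queue. */
static void ex_tx_complete(struct ex_txq *q)
{
	if (q->in_flight > 0)
		q->in_flight--;
	q->stopped = false;
}
```

A caller that honours the stopped flag never sees EX_TX_BUSY in this model, which is the behaviour the return code description asks drivers to aim for.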
|
||||
|
||||
ndo_tx_timeout:
|
||||
Synchronization: netif_tx_lock spinlock; all TX queues frozen.
|
||||
Context: BHs disabled
|
||||
Notes: netif_queue_stopped() is guaranteed true
|
||||
|
||||
ndo_set_rx_mode:
|
||||
Synchronization: netif_addr_lock spinlock.
|
||||
Context: BHs disabled
|
||||
|
||||
struct napi_struct synchronization rules
|
||||
========================================
|
||||
napi->poll:
|
||||
Synchronization: NAPI_STATE_SCHED bit in napi->state. Device
|
||||
driver's ndo_stop method will invoke napi_disable() on
|
||||
all NAPI instances which will do a sleeping poll on the
|
||||
NAPI_STATE_SCHED napi->state bit, waiting for all pending
|
||||
NAPI activity to cease.
|
||||
Context: softirq
|
||||
will be called with interrupts disabled by netconsole.
|
||||
79
Documentation/networking/netif-msg.txt
Normal file
|
|
@ -0,0 +1,79 @@
|
|||
|
||||
________________
|
||||
NETIF Msg Level
|
||||
|
||||
The design of the network interface message level setting.
|
||||
|
||||
History
|
||||
|
||||
The design of the debugging message interface was guided and
|
||||
constrained by backwards compatibility with previous practice. It is useful
|
||||
to understand the history and evolution in order to understand current
|
||||
practice and relate it to older driver source code.
|
||||
|
||||
From the beginning of Linux, each network device driver has had a local
|
||||
integer variable that controls the debug message level. The message
|
||||
level ranged from 0 to 7, and monotonically increased in verbosity.
|
||||
|
||||
The message levels were not precisely defined past level 3, but were
|
||||
always implemented within +-1 of the specified level. Drivers tended
|
||||
to shed the more verbose level messages as they matured.
|
||||
0 Minimal messages, only essential information on fatal errors.
|
||||
1 Standard messages, initialization status. No run-time messages
|
||||
2 Special media selection messages, generally timer-driven.
|
||||
3 Interface starts and stops, including normal status messages
|
||||
4 Tx and Rx frame error messages, and abnormal driver operation
|
||||
5 Tx packet queue information, interrupt events.
|
||||
6 Status on each completed Tx packet and received Rx packets
|
||||
7 Initial contents of Tx and Rx packets
|
||||
|
||||
Initially this message level variable was uniquely named in each driver
|
||||
e.g. "lance_debug", so that a kernel symbolic debugger could locate and
|
||||
modify the setting. When kernel modules became common, the variables
|
||||
were consistently renamed to "debug" and allowed to be set as a module
|
||||
parameter.
|
||||
|
||||
This approach worked well. However there is always a demand for
|
||||
additional features. Over the years the following emerged as
|
||||
reasonable and easily implemented enhancements
|
||||
Using an ioctl() call to modify the level.
|
||||
Per-interface rather than per-driver message level setting.
|
||||
More selective control over the type of messages emitted.
|
||||
|
||||
The netif_msg recommendation adds these features with only a minor
|
||||
complexity and code size increase.
|
||||
|
||||
The recommendation is the following points
|
||||
Retaining the per-driver integer variable "debug" as a module
|
||||
parameter with a default level of '1'.
|
||||
|
||||
Adding a per-interface private variable named "msg_enable". The
|
||||
variable is a bit map rather than a level, and is initialized as
|
||||
1 << debug
|
||||
Or more precisely
|
||||
debug < 0 ? 0 : 1 << min(sizeof(int) * 8 - 1, debug)
|
||||
|
||||
Messages should change from
|
||||
if (debug > 1)
|
||||
printk(KERN_DEBUG "%s: ...
|
||||
to
|
||||
if (np->msg_enable & NETIF_MSG_LINK)
|
||||
printk(KERN_DEBUG "%s: ...
|
||||
|
||||
|
||||
The set of message levels is named
|
||||
Old level Name Bit position
|
||||
0 NETIF_MSG_DRV 0x0001
|
||||
1 NETIF_MSG_PROBE 0x0002
|
||||
2 NETIF_MSG_LINK 0x0004
|
||||
2 NETIF_MSG_TIMER 0x0004
|
||||
3 NETIF_MSG_IFDOWN 0x0008
|
||||
3 NETIF_MSG_IFUP 0x0008
|
||||
4 NETIF_MSG_RX_ERR 0x0010
|
||||
4 NETIF_MSG_TX_ERR 0x0010
|
||||
5 NETIF_MSG_TX_QUEUED 0x0020
|
||||
5 NETIF_MSG_INTR 0x0020
|
||||
6 NETIF_MSG_TX_DONE 0x0040
|
||||
6 NETIF_MSG_RX_STATUS 0x0040
|
||||
7 NETIF_MSG_PKTDATA 0x0080
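
The mapping from the old "debug" level to the "msg_enable" bitmap can be sketched in plain C. This follows the expression given earlier literally (one bit per level) and clamps the shift to the width of an unsigned int, which is an assumption about the intent of that expression. The EX_ constants are local copies of a few table entries, and the helper names are hypothetical; real drivers may initialize the bitmap differently.

```c
#include <limits.h>

/* Bit values copied from the table above; the EX_ prefix marks local copies. */
enum {
	EX_MSG_DRV	= 0x0001,
	EX_MSG_PROBE	= 0x0002,
	EX_MSG_LINK	= 0x0004,
	EX_MSG_IFUP	= 0x0008,
};

/* debug < 0 disables all messages; large levels are clamped to the top bit. */
static unsigned int ex_msg_enable_from_debug(int debug)
{
	const int top_bit = (int)(sizeof(unsigned int) * CHAR_BIT) - 1;

	if (debug < 0)
		return 0;
	if (debug > top_bit)
		debug = top_bit;
	return 1u << debug;
}

/* Non-zero when the given message class is set in the bitmap. */
static int ex_msg_enabled(unsigned int msg_enable, unsigned int msg_class)
{
	return (msg_enable & msg_class) != 0;
}
```

In this sketch a "debug" value of 2 enables only the link/timer class, so a message guarded by EX_MSG_LINK is printed while one guarded by EX_MSG_PROBE is not.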
|
||||
|
||||
339
Documentation/networking/netlink_mmap.txt
Normal file
|
|
@ -0,0 +1,339 @@
|
|||
This file documents how to use memory mapped I/O with netlink.
|
||||
|
||||
Author: Patrick McHardy <kaber@trash.net>
|
||||
|
||||
Overview
|
||||
--------
|
||||
|
||||
Memory mapped netlink I/O can be used to increase throughput and decrease
|
||||
overhead of unicast receive and transmit operations. Some netlink subsystems
|
||||
require high throughput; these are mainly the netfilter subsystems
|
||||
nfnetlink_queue and nfnetlink_log, but it can also help speed up large
|
||||
dump operations of, for instance, the routing database.
|
||||
|
||||
Memory mapped netlink I/O uses two circular ring buffers for RX and TX which
|
||||
are mapped into the process's address space.
|
||||
|
||||
The RX ring is used by the kernel to directly construct netlink messages into
user-space memory without copying them as done with regular socket I/O.
Additionally, as long as the ring contains messages, no recvmsg() or poll()
syscalls have to be issued by user-space to get more messages.
|
||||
|
||||
The TX ring is used to process messages directly from user-space memory; the
|
||||
kernel processes all messages contained in the ring using a single sendmsg()
|
||||
call.
|
||||
|
||||
Usage overview
|
||||
--------------
|
||||
|
||||
In order to use memory mapped netlink I/O, user-space needs three main changes:
|
||||
|
||||
- ring setup
|
||||
- conversion of the RX path to get messages from the ring instead of recvmsg()
|
||||
- conversion of the TX path to construct messages into the ring
|
||||
|
||||
Ring setup is done using setsockopt() to provide the ring parameters to the
|
||||
kernel, then a call to mmap() to map the ring into the process's address space:
|
||||
|
||||
- setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &params, sizeof(params));
|
||||
- setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &params, sizeof(params));
|
||||
- ring = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0)
|
||||
|
||||
Usage of either ring is optional, but even if only the RX ring is used the
|
||||
mapping still needs to be writable in order to update the frame status after
|
||||
processing.
|
||||
|
||||
Conversion of the reception path involves calling poll() on the file
|
||||
descriptor; once the socket is readable, the frames from the ring are
|
||||
processed in order until no more messages are available, as indicated by
|
||||
a status word in the frame header.
|
||||
|
||||
On the kernel side, in order to make use of memory mapped I/O on receive, the
|
||||
originating netlink subsystem needs to support memory mapped I/O, otherwise
|
||||
it will use an allocated socket buffer as usual and the contents will be
|
||||
copied to the ring on transmission, nullifying most of the performance gains.
|
||||
Dumps of kernel databases automatically support memory mapped I/O.
|
||||
|
||||
Conversion of the transmit path involves changing message construction to
|
||||
use memory from the TX ring instead of (usually) a buffer declared on the
|
||||
stack and setting up the frame header appropriately. Optionally poll() can
|
||||
be used to wait for free frames in the TX ring.
|
||||
|
||||
Structures and definitions for using memory mapped I/O are contained in
|
||||
<linux/netlink.h>.
|
||||
|
||||
RX and TX rings
|
||||
----------------
|
||||
|
||||
Each ring contains a number of continuous memory blocks, containing frames of
|
||||
fixed size dependent on the parameters used for ring setup.
|
||||
|
||||
Ring:	[ block 0 ]
		[ frame 0 ]
		[ frame 1 ]
	[ block 1 ]
		[ frame 2 ]
		[ frame 3 ]
	...
	[ block n ]
		[ frame 2 * n ]
		[ frame 2 * n + 1 ]
|
||||
|
||||
The blocks are only visible to the kernel; from the point of view of user-space
|
||||
the ring just contains the frames in a continuous memory zone.
|
||||
|
||||
The ring parameters used for setting up the ring are defined as follows:
|
||||
|
||||
struct nl_mmap_req {
|
||||
unsigned int nm_block_size;
|
||||
unsigned int nm_block_nr;
|
||||
unsigned int nm_frame_size;
|
||||
unsigned int nm_frame_nr;
|
||||
};
|
||||
|
||||
Frames are grouped into blocks, where each block is a continuous region of memory
|
||||
and holds nm_block_size / nm_frame_size frames. The total number of frames in
|
||||
the ring is nm_frame_nr. The following invariants hold:
|
||||
|
||||
- frames_per_block = nm_block_size / nm_frame_size
|
||||
|
||||
- nm_frame_nr = frames_per_block * nm_block_nr
|
||||
|
||||
Some parameters are constrained, specifically:
|
||||
|
||||
- nm_block_size must be a multiple of the architecture's memory page size.
|
||||
The getpagesize() function can be used to get the page size.
|
||||
|
||||
- nm_frame_size must be equal to or larger than NL_MMAP_HDRLEN, i.e. a frame must be
|
||||
able to hold at least the frame header
|
||||
|
||||
- nm_frame_size must be smaller than or equal to nm_block_size
|
||||
|
||||
- nm_frame_size must be a multiple of NL_MMAP_MSG_ALIGNMENT
|
||||
|
||||
- nm_frame_nr must equal the actual number of frames as specified above.
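
The invariants and constraints listed above can be checked before calling setsockopt(). The following is a sketch under stated assumptions: struct ex_ring_req is a local mirror of the four nl_mmap_req fields, and the page size, header length and alignment are passed in as parameters rather than taken from NL_MMAP_HDRLEN and NL_MMAP_MSG_ALIGNMENT, so the code does not depend on <linux/netlink.h>. Overflow of the multiplications is not handled.

```c
#include <stdbool.h>

/* Local mirror of the four fields of struct nl_mmap_req (assumed name). */
struct ex_ring_req {
	unsigned int block_size;
	unsigned int block_nr;
	unsigned int frame_size;
	unsigned int frame_nr;
};

/* Returns true when the request satisfies every constraint listed above. */
static bool ex_ring_req_valid(const struct ex_ring_req *req,
			      unsigned int page_size,
			      unsigned int hdrlen,
			      unsigned int alignment)
{
	unsigned int frames_per_block;

	if (page_size == 0 || alignment == 0)
		return false;
	/* block size: non-zero multiple of the page size */
	if (req->block_size == 0 || req->block_size % page_size != 0)
		return false;
	/* frame size: holds at least the header, fits in a block, aligned */
	if (req->frame_size == 0 || req->frame_size < hdrlen)
		return false;
	if (req->frame_size > req->block_size)
		return false;
	if (req->frame_size % alignment != 0)
		return false;
	/* nm_frame_nr must equal frames_per_block * nm_block_nr */
	frames_per_block = req->block_size / req->frame_size;
	return req->frame_nr == frames_per_block * req->block_nr;
}
```

For example, 64 blocks of 16 pages (4096 bytes each) with 16384 byte frames give 4 frames per block and therefore require nm_frame_nr to be 256.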
|
||||
|
||||
When the kernel can't allocate physically continuous memory for a ring block,
|
||||
it will fall back to use physically discontinuous memory. This might affect
|
||||
performance negatively; to avoid this, the nm_frame_size parameter
|
||||
should be chosen to be as small as possible for the required frame size and
|
||||
the number of blocks should be increased instead.
|
||||
|
||||
Ring frames
|
||||
------------
|
||||
|
||||
Each frame contains a frame header, consisting of a synchronization word and some
|
||||
meta-data, and the message itself.
|
||||
|
||||
Frame: [ header message ]
|
||||
|
||||
The frame header is defined as follows:
|
||||
|
||||
struct nl_mmap_hdr {
|
||||
unsigned int nm_status;
|
||||
unsigned int nm_len;
|
||||
__u32 nm_group;
|
||||
/* credentials */
|
||||
__u32 nm_pid;
|
||||
__u32 nm_uid;
|
||||
__u32 nm_gid;
|
||||
};
|
||||
|
||||
- nm_status is used for synchronizing processing between the kernel and user-
|
||||
space and specifies ownership of the frame as well as the operation to perform
|
||||
|
||||
- nm_len contains the length of the message contained in the data area
|
||||
|
||||
- nm_group specifies the destination multicast group of the message
|
||||
|
||||
- nm_pid, nm_uid and nm_gid contain the netlink pid, UID and GID of the sending
|
||||
process. These values correspond to the data available using SOCK_PASSCRED in
|
||||
the SCM_CREDENTIALS cmsg.
|
||||
|
||||
The possible values in the status word are:
|
||||
|
||||
- NL_MMAP_STATUS_UNUSED:
|
||||
RX ring: frame belongs to the kernel and contains no message
|
||||
for user-space. Appropriate action is to invoke poll()
|
||||
to wait for new messages.
|
||||
|
||||
TX ring: frame belongs to user-space and can be used for
|
||||
message construction.
|
||||
|
||||
- NL_MMAP_STATUS_RESERVED:
|
||||
RX ring only: frame is currently used by the kernel for message
|
||||
construction and contains no valid message yet.
|
||||
Appropriate action is to invoke poll() to wait for
|
||||
new messages.
|
||||
|
||||
- NL_MMAP_STATUS_VALID:
|
||||
RX ring: frame contains a valid message. Appropriate action is
|
||||
to process the message and release the frame back to
|
||||
the kernel by setting the status to
|
||||
NL_MMAP_STATUS_UNUSED or queue the frame by setting the
|
||||
status to NL_MMAP_STATUS_SKIP.
|
||||
|
||||
TX ring: the frame contains a valid message from user-space to
|
||||
be processed by the kernel. After completing processing
|
||||
the kernel will release the frame back to user-space by
|
||||
setting the status to NL_MMAP_STATUS_UNUSED.
|
||||
|
||||
- NL_MMAP_STATUS_COPY:
|
||||
RX ring only: a message is ready to be processed but could not be
|
||||
stored in the ring, either because it exceeded the
|
||||
frame size or because the originating subsystem does
|
||||
not support memory mapped I/O. Appropriate action is
|
||||
to invoke recvmsg() to receive the message and release
|
||||
the frame back to the kernel by setting the status to
|
||||
NL_MMAP_STATUS_UNUSED.
|
||||
|
||||
- NL_MMAP_STATUS_SKIP:
|
||||
RX ring only: user-space queued the message for later processing, but
|
||||
processed some messages following it in the ring. The
|
||||
kernel should skip this frame when looking for unused
|
||||
frames.
|
||||
|
||||
The data area of a frame begins at an offset of NL_MMAP_HDRLEN relative to the
|
||||
frame header.
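
Since user-space sees the ring as one continuous zone of fixed-size frames, locating a frame and its data area is plain pointer arithmetic. The helpers below are a hypothetical sketch (the ex_ names are not part of any API); the header length is passed as a parameter where real code would use NL_MMAP_HDRLEN.

```c
#include <stddef.h>

/* Header of frame number idx in a ring whose frames are frame_size bytes. */
static void *ex_frame_hdr(void *ring, size_t frame_size, size_t idx)
{
	return (char *)ring + idx * frame_size;
}

/* Start of the data area, hdrlen bytes past the start of the frame header. */
static void *ex_frame_data(void *frame_hdr, size_t hdrlen)
{
	return (char *)frame_hdr + hdrlen;
}

/* Offset of the frame after the one at offset, wrapping at ring_size. */
static size_t ex_next_frame_offset(size_t offset, size_t frame_size,
				   size_t ring_size)
{
	return (offset + frame_size) % ring_size;
}
```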
|
||||
|
||||
TX limitations
|
||||
--------------
|
||||
|
||||
Kernel processing usually involves validation of the message received from
|
||||
user-space, then processing its contents. The kernel must assure that
|
||||
userspace is not able to modify the message contents after they have been
|
||||
validated. In order to do so, the message is copied from the ring frame
|
||||
to an allocated buffer if either of these conditions is false:
|
||||
|
||||
- only a single mapping of the ring exists
|
||||
- the file descriptor is not shared between processes
|
||||
|
||||
This means that for threaded programs, the kernel will fall back to copying.
|
||||
|
||||
Example
|
||||
-------
|
||||
|
||||
Ring setup:
|
||||
|
||||
	unsigned int block_size = 16 * getpagesize();
	struct nl_mmap_req req = {
		.nm_block_size		= block_size,
		.nm_block_nr		= 64,
		.nm_frame_size		= 16384,
		.nm_frame_nr		= 64 * block_size / 16384,
	};
	unsigned int ring_size;
	void *rx_ring, *tx_ring;

	/* Configure ring parameters */
	if (setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &req, sizeof(req)) < 0)
		exit(1);
	if (setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &req, sizeof(req)) < 0)
		exit(1);

	/* Calculate size of each individual ring */
	ring_size = req.nm_block_nr * req.nm_block_size;

	/* Map RX/TX rings. The TX ring is located after the RX ring */
	rx_ring = mmap(NULL, 2 * ring_size, PROT_READ | PROT_WRITE,
		       MAP_SHARED, fd, 0);
	if (rx_ring == MAP_FAILED)
		exit(1);
	tx_ring = rx_ring + ring_size;
|
||||
|
||||
Message reception:
|
||||
|
||||
This example assumes some ring parameters of the ring setup are available.
|
||||
|
||||
	unsigned int frame_offset = 0;
	struct nl_mmap_hdr *hdr;
	struct nlmsghdr *nlh;
	unsigned char buf[16384];
	ssize_t len;

	while (1) {
		struct pollfd pfds[1];

		pfds[0].fd	= fd;
		pfds[0].events	= POLLIN | POLLERR;
		pfds[0].revents	= 0;

		/* errno holds positive error codes in user-space */
		if (poll(pfds, 1, -1) < 0 && errno != EINTR)
			exit(1);

		/* Check for errors. Error handling omitted */
		if (pfds[0].revents & POLLERR)
			<handle error>

		/* If no new messages, poll again */
		if (!(pfds[0].revents & POLLIN))
			continue;

		/* Process all frames */
		while (1) {
			/* Get next frame header */
			hdr = rx_ring + frame_offset;

			if (hdr->nm_status == NL_MMAP_STATUS_VALID) {
				/* Regular memory mapped frame */
				nlh = (void *)hdr + NL_MMAP_HDRLEN;
				len = hdr->nm_len;

				/* Release empty message immediately. May happen
				 * on error during message construction.
				 */
				if (len == 0)
					goto release;
			} else if (hdr->nm_status == NL_MMAP_STATUS_COPY) {
				/* Frame queued to socket receive queue */
				len = recv(fd, buf, sizeof(buf), MSG_DONTWAIT);
				if (len <= 0)
					break;
				nlh = (struct nlmsghdr *)buf;
			} else
				/* No more messages to process, continue polling */
				break;

			process_msg(nlh);
release:
			/* Release frame back to the kernel */
			hdr->nm_status = NL_MMAP_STATUS_UNUSED;

			/* Advance frame offset to next frame */
			frame_offset = (frame_offset + frame_size) % ring_size;
		}
	}
|
||||
|
||||
Message transmission:
|
||||
|
||||
This example assumes some ring parameters of the ring setup are available.
|
||||
A single message is constructed and transmitted, to send multiple messages
|
||||
at once they would be constructed in consecutive frames before a final call
|
||||
to sendto().
|
||||
|
||||
	unsigned int frame_offset = 0;
	struct nl_mmap_hdr *hdr;
	struct nlmsghdr *nlh;
	struct sockaddr_nl addr = {
		.nl_family	= AF_NETLINK,
	};

	hdr = tx_ring + frame_offset;
	if (hdr->nm_status != NL_MMAP_STATUS_UNUSED)
		/* No frame available. Use poll() to avoid. */
		exit(1);

	nlh = (void *)hdr + NL_MMAP_HDRLEN;

	/* Build message */
	build_message(nlh);

	/* Fill frame header: length and status need to be set */
	hdr->nm_len	= nlh->nlmsg_len;
	hdr->nm_status	= NL_MMAP_STATUS_VALID;

	if (sendto(fd, NULL, 0, 0, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		exit(1);

	/* Advance frame offset to next frame */
	frame_offset = (frame_offset + frame_size) % ring_size;
|
||||
176
Documentation/networking/nf_conntrack-sysctl.txt
Normal file
|
|
@ -0,0 +1,176 @@
|
|||
/proc/sys/net/netfilter/nf_conntrack_* Variables:
|
||||
|
||||
nf_conntrack_acct - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
|
||||
Enable connection tracking flow accounting. 64-bit byte and packet
|
||||
counters per flow are added.
|
||||
|
||||
nf_conntrack_buckets - INTEGER (read-only)
|
||||
Size of hash table. If not specified as parameter during module
|
||||
loading, the default size is calculated by dividing total memory
|
||||
by 16384 to determine the number of buckets but the hash table will
|
||||
never have fewer than 32 or more than 16384 buckets.
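
The default described above amounts to a division followed by a clamp. The sketch below follows the wording of this entry and assumes "total memory" is expressed in bytes; ex_default_buckets() is a hypothetical helper, not the kernel's own computation.

```c
/* Default bucket count as described above: memory / 16384, then clamped. */
static unsigned long ex_default_buckets(unsigned long long total_mem_bytes)
{
	unsigned long long buckets = total_mem_bytes / 16384;

	if (buckets < 32)
		buckets = 32;		/* lower bound from the text */
	if (buckets > 16384)
		buckets = 16384;	/* upper bound from the text */
	return (unsigned long)buckets;
}
```

Under that assumption a machine with 64 MiB of memory gets 4096 buckets, and anything from 256 MiB upwards hits the 16384 bucket ceiling.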
|
||||
|
||||
nf_conntrack_checksum - BOOLEAN
|
||||
0 - disabled
|
||||
not 0 - enabled (default)
|
||||
|
||||
Verify checksum of incoming packets. Packets with bad checksums are
|
||||
in INVALID state. If this is enabled, such packets will not be
|
||||
considered for connection tracking.
|
||||
|
||||
nf_conntrack_count - INTEGER (read-only)
|
||||
Number of currently allocated flow entries.
|
||||
|
||||
nf_conntrack_events - BOOLEAN
|
||||
0 - disabled
|
||||
not 0 - enabled (default)
|
||||
|
||||
If this option is enabled, the connection tracking code will
|
||||
provide userspace with connection tracking events via ctnetlink.
|
||||
|
||||
nf_conntrack_events_retry_timeout - INTEGER (seconds)
|
||||
default 15
|
||||
|
||||
This option is only relevant when "reliable connection tracking
|
||||
events" are used. Normally, ctnetlink is "lossy", that is,
|
||||
events are normally dropped when userspace listeners can't keep up.
|
||||
|
||||
Userspace can request "reliable event mode". When this mode is
|
||||
active, the conntrack will only be destroyed after the event was
|
||||
delivered. If event delivery fails, the kernel periodically
|
||||
re-tries to send the event to userspace.
|
||||
|
||||
This is the maximum interval the kernel should use when re-trying
|
||||
to deliver the destroy event.
|
||||
|
||||
A higher number means there will be fewer delivery retries and it
|
||||
will take longer for a backlog to be processed.
|
||||
|
||||
nf_conntrack_expect_max - INTEGER
|
||||
Maximum size of expectation table. Default value is
|
||||
nf_conntrack_buckets / 256. Minimum is 1.
|
||||
|
||||
nf_conntrack_frag6_high_thresh - INTEGER
|
||||
default 262144
|
||||
|
||||
Maximum memory used to reassemble IPv6 fragments. When
|
||||
nf_conntrack_frag6_high_thresh bytes of memory is allocated for this
|
||||
purpose, the fragment handler will toss packets until
|
||||
nf_conntrack_frag6_low_thresh is reached.
|
||||
|
||||
nf_conntrack_frag6_low_thresh - INTEGER
|
||||
default 196608
|
||||
|
||||
See nf_conntrack_frag6_high_thresh
|
||||
|
||||
nf_conntrack_frag6_timeout - INTEGER (seconds)
|
||||
default 60
|
||||
|
||||
Time to keep an IPv6 fragment in memory.
|
||||
|
||||
nf_conntrack_generic_timeout - INTEGER (seconds)
|
||||
default 600
|
||||
|
||||
Default for generic timeout. This refers to layer 4 unknown/unsupported
|
||||
protocols.
|
||||
|
||||
nf_conntrack_helper - BOOLEAN
|
||||
0 - disabled
|
||||
not 0 - enabled (default)
|
||||
|
||||
Enable automatic conntrack helper assignment.
|
||||
|
||||
nf_conntrack_icmp_timeout - INTEGER (seconds)
|
||||
default 30
|
||||
|
||||
Default for ICMP timeout.
|
||||
|
||||
nf_conntrack_icmpv6_timeout - INTEGER (seconds)
|
||||
default 30
|
||||
|
||||
Default for ICMP6 timeout.
|
||||
|
||||
nf_conntrack_log_invalid - INTEGER
|
||||
0 - disable (default)
|
||||
1 - log ICMP packets
|
||||
6 - log TCP packets
|
||||
17 - log UDP packets
|
||||
33 - log DCCP packets
|
||||
41 - log ICMPv6 packets
|
||||
136 - log UDPLITE packets
|
||||
255 - log packets of any protocol
|
||||
|
||||
Log invalid packets of a type specified by value.
|
||||
|
||||
nf_conntrack_max - INTEGER
|
||||
Size of connection tracking table. Default value is
|
||||
nf_conntrack_buckets value * 4.
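
Both nf_conntrack_max and nf_conntrack_expect_max take their defaults from nf_conntrack_buckets. A minimal sketch of the two formulas given in this file, with hypothetical helper names:

```c
/* Default nf_conntrack_max: four entries per hash bucket. */
static unsigned int ex_default_conntrack_max(unsigned int buckets)
{
	return buckets * 4;
}

/* Default nf_conntrack_expect_max: buckets / 256, but never below 1. */
static unsigned int ex_default_expect_max(unsigned int buckets)
{
	unsigned int expect_max = buckets / 256;

	return expect_max < 1 ? 1 : expect_max;
}
```

With 16384 buckets this gives a table of 65536 connections and 64 expectations; with the minimum of 32 buckets the expectation limit falls back to 1.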
|
||||
|
||||
nf_conntrack_tcp_be_liberal - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
|
||||
Be conservative in what you do, be liberal in what you accept from others.
|
||||
If it's non-zero, we mark only out of window RST segments as INVALID.
|
||||
|
||||
nf_conntrack_tcp_loose - BOOLEAN
|
||||
0 - disabled
|
||||
not 0 - enabled (default)
|
||||
|
||||
If it is set to zero, we disable picking up already established
|
||||
connections.
|
||||
|
||||
nf_conntrack_tcp_max_retrans - INTEGER
|
||||
default 3
|
||||
|
||||
Maximum number of packets that can be retransmitted without
|
||||
receiving an (acceptable) ACK from the destination. If this number
|
||||
is reached, a shorter timer will be started.
|
||||
|
||||
nf_conntrack_tcp_timeout_close - INTEGER (seconds)
|
||||
default 10
|
||||
|
||||
nf_conntrack_tcp_timeout_close_wait - INTEGER (seconds)
|
||||
default 60
|
||||
|
||||
nf_conntrack_tcp_timeout_established - INTEGER (seconds)
|
||||
default 432000 (5 days)
|
||||
|
||||
nf_conntrack_tcp_timeout_fin_wait - INTEGER (seconds)
|
||||
default 120
|
||||
|
||||
nf_conntrack_tcp_timeout_last_ack - INTEGER (seconds)
|
||||
default 30
|
||||
|
||||
nf_conntrack_tcp_timeout_max_retrans - INTEGER (seconds)
|
||||
default 300
|
||||
|
||||
nf_conntrack_tcp_timeout_syn_recv - INTEGER (seconds)
|
||||
default 60
|
||||
|
||||
nf_conntrack_tcp_timeout_syn_sent - INTEGER (seconds)
|
||||
default 120
|
||||
|
||||
nf_conntrack_tcp_timeout_time_wait - INTEGER (seconds)
|
||||
default 120
|
||||
|
||||
nf_conntrack_tcp_timeout_unacknowledged - INTEGER (seconds)
|
||||
default 300
|
||||
|
||||
nf_conntrack_timestamp - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
|
||||
Enable connection tracking flow timestamping.
|
||||
|
||||
nf_conntrack_udp_timeout - INTEGER (seconds)
|
||||
default 30
|
||||
|
||||
nf_conntrack_udp_timeout_stream2 - INTEGER (seconds)
|
||||
default 180
|
||||
|
||||
This extended timeout will be used in case there is a UDP stream
|
||||
detected.
|
||||
128
Documentation/networking/nfc.txt
Normal file
|
|
@ -0,0 +1,128 @@
|
|||
Linux NFC subsystem
|
||||
===================
|
||||
|
||||
The Near Field Communication (NFC) subsystem is required to standardize the
|
||||
NFC device driver development and to create a unified userspace interface.
|
||||
|
||||
This document covers the architecture overview, the device driver interface
|
||||
description and the userspace interface description.
|
||||
|
||||
Architecture overview
|
||||
---------------------
|
||||
|
||||
The NFC subsystem is responsible for:
|
||||
- NFC adapters management;
|
||||
- Polling for targets;
|
||||
- Low-level data exchange;
|
||||
|
||||
The subsystem is divided into several parts. The 'core' is responsible for
|
||||
providing the device driver interface. On the other side, it is also
|
||||
responsible for providing an interface to control operations and low-level
|
||||
data exchange.
|
||||
|
||||
The control operations are available to userspace via generic netlink.
|
||||
|
||||
The low-level data exchange interface is provided by the new socket family
|
||||
PF_NFC. The NFC_SOCKPROTO_RAW performs raw communication with NFC targets.
|
||||
|
||||
|
||||
        +--------------------------------------+
        |              USER SPACE              |
        +--------------------------------------+
            ^                       ^
            | low-level             | control
            | data exchange         | operations
            |                       |
            |                       v
            |                  +-----------+
            | AF_NFC           |  netlink  |
            | socket           +-----------+
            | raw                   ^
            |                       |
            v                       v
        +---------+            +-----------+
        | rawsock | <--------> |   core    |
        +---------+            +-----------+
                                    ^
                                    |
                                    v
                               +-----------+
                               |  driver   |
                               +-----------+
|
||||
|
||||
Device Driver Interface
|
||||
-----------------------
|
||||
|
||||
When registering on the NFC subsystem, the device driver must inform the core
|
||||
of the set of supported NFC protocols and the set of ops callbacks. The ops
|
||||
callbacks that must be implemented are the following:
|
||||
|
||||
* start_poll - setup the device to poll for targets
|
||||
* stop_poll - stop an in-progress polling operation
|
||||
* activate_target - select and initialize one of the targets found
|
||||
* deactivate_target - deselect and deinitialize the selected target
|
||||
* data_exchange - send data and receive the response (transceive operation)
|
||||
|
||||
Userspace interface
|
||||
--------------------
|
||||
|
||||
The userspace interface is divided into control operations and low-level data
|
||||
exchange operation.
|
||||
|
||||
CONTROL OPERATIONS:
|
||||
|
||||
Generic netlink is used to implement the interface to the control operations.
|
||||
The operations are composed of commands and events, all listed below:
|
||||
|
||||
* NFC_CMD_GET_DEVICE - get specific device info or dump the device list
|
||||
* NFC_CMD_START_POLL - set up a specific device to poll for targets
|
||||
* NFC_CMD_STOP_POLL - stop the polling operation in a specific device
|
||||
* NFC_CMD_GET_TARGET - dump the list of targets found by a specific device
|
||||
|
||||
* NFC_EVENT_DEVICE_ADDED - reports an NFC device addition
|
||||
* NFC_EVENT_DEVICE_REMOVED - reports an NFC device removal
|
||||
* NFC_EVENT_TARGETS_FOUND - reports START_POLL results when 1 or more targets
|
||||
are found
|
||||
|
||||
The user must call START_POLL to poll for NFC targets, passing the desired NFC
|
||||
protocols through the NFC_ATTR_PROTOCOLS attribute. The device remains in polling
|
||||
state until it finds any target. However, the user can stop the polling
|
||||
operation by calling the STOP_POLL command. In this case, it will be checked if
|
||||
the requester of STOP_POLL is the same as the requester of START_POLL.
|
||||
|
||||
If the polling operation finds one or more targets, the event TARGETS_FOUND is
|
||||
sent (including the device id). The user must call GET_TARGET to get the list of
|
||||
all targets found by such device. Each reply message has target attributes with
|
||||
relevant information such as the supported NFC protocols.
|
||||
|
||||
All polling operations requested through one netlink socket are stopped when
|
||||
it's closed.
|
||||
|
||||
LOW-LEVEL DATA EXCHANGE:
|
||||
|
||||
The userspace must use PF_NFC sockets to perform any data communication with
|
||||
targets. All NFC sockets use AF_NFC:
|
||||
|
||||
struct sockaddr_nfc {
|
||||
sa_family_t sa_family;
|
||||
__u32 dev_idx;
|
||||
__u32 target_idx;
|
||||
__u32 nfc_protocol;
|
||||
};
|
||||
|
||||
To establish a connection with one target, the user must create an
|
||||
NFC_SOCKPROTO_RAW socket and call the 'connect' syscall with the sockaddr_nfc
|
||||
struct correctly filled. All information comes from NFC_EVENT_TARGETS_FOUND
|
||||
netlink event. As a target can support more than one NFC protocol, the user
|
||||
must inform which protocol it wants to use.
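
Filling the address before calling connect() might look like the sketch below. struct ex_sockaddr_nfc is a local mirror of the structure listed above so that the example does not need the NFC headers, EX_AF_NFC carries the value 39 that Linux assigns to AF_NFC, and ex_fill_nfc_addr() is a hypothetical helper. The index and protocol values are assumed to come from the target information reported over netlink.

```c
#include <string.h>

#define EX_AF_NFC 39	/* numeric value of AF_NFC on Linux */

/* Local mirror of struct sockaddr_nfc as listed above (assumed layout). */
struct ex_sockaddr_nfc {
	unsigned short sa_family;
	unsigned int dev_idx;
	unsigned int target_idx;
	unsigned int nfc_protocol;
};

/* Fill every field; the caller then passes the struct to connect(). */
static void ex_fill_nfc_addr(struct ex_sockaddr_nfc *addr,
			     unsigned int dev_idx,
			     unsigned int target_idx,
			     unsigned int nfc_protocol)
{
	memset(addr, 0, sizeof(*addr));
	addr->sa_family = EX_AF_NFC;
	addr->dev_idx = dev_idx;
	addr->target_idx = target_idx;
	addr->nfc_protocol = nfc_protocol;	/* exactly one protocol */
}
```

Zeroing the structure first keeps padding bytes deterministic, which is a common precaution when a struct is handed to the kernel.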
|
||||
|
||||
Internally, 'connect' will result in an activate_target call to the driver.
|
||||
When the socket is closed, the target is deactivated.
|
||||
|
||||
The data format exchanged through the sockets is NFC protocol dependent. For
|
||||
instance, when communicating with MIFARE tags, the data exchanged are MIFARE
|
||||
commands and their responses.
|
||||
|
||||
The first received packet is the response to the first sent packet and so
|
||||
on. In order to allow valid "empty" responses, all received data has a NULL
|
||||
header of 1 byte.
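
A receiver therefore has to skip the first byte of whatever it reads from the socket. A minimal sketch, assuming the buffer and byte count come from a prior read on the NFC socket; ex_nfc_payload() is a hypothetical helper.

```c
#include <stddef.h>

/*
 * Returns the payload length (0 for a valid "empty" response) and stores
 * the payload start in *payload, or returns -1 if not even the 1-byte
 * header was received.
 */
static long ex_nfc_payload(const unsigned char *buf, long received,
			   const unsigned char **payload)
{
	if (received < 1)
		return -1;

	*payload = buf + 1;	/* skip the 1-byte NULL header */
	return received - 1;
}
```

Thanks to the header, a read of exactly one byte can be told apart from a read that returned nothing: the former is an empty response, the latter is not a response at all.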
|
||||
235
Documentation/networking/openvswitch.txt
Normal file
|
|
@ -0,0 +1,235 @@
|
|||
Open vSwitch datapath developer documentation
|
||||
=============================================
|
||||
|
||||
The Open vSwitch kernel module allows flexible userspace control over
|
||||
flow-level packet processing on selected network devices. It can be
|
||||
used to implement a plain Ethernet switch, network device bonding,
|
||||
VLAN processing, network access control, flow-based network control,
|
||||
and so on.
|
||||
|
||||
The kernel module implements multiple "datapaths" (analogous to
|
||||
bridges), each of which can have multiple "vports" (analogous to ports
|
||||
within a bridge). Each datapath also has associated with it a "flow
|
||||
table" that userspace populates with "flows" that map from keys based
|
||||
on packet headers and metadata to sets of actions. The most common
|
||||
action forwards the packet to another vport; other actions are also
|
||||
implemented.
|
||||
|
||||
When a packet arrives on a vport, the kernel module processes it by
|
||||
extracting its flow key and looking it up in the flow table. If there
|
||||
is a matching flow, it executes the associated actions. If there is
|
||||
no match, it queues the packet to userspace for processing (as part of
|
||||
its processing, userspace will likely set up a flow to handle further
|
||||
packets of the same type entirely in-kernel).
|
||||
|
||||
|
||||
Flow key compatibility
|
||||
----------------------
|
||||
|
||||
Network protocols evolve over time. New protocols become important
|
||||
and existing protocols lose their prominence. For the Open vSwitch
|
||||
kernel module to remain relevant, it must be possible for newer
|
||||
versions to parse additional protocols as part of the flow key. It
|
||||
might even be desirable, someday, to drop support for parsing
|
||||
protocols that have become obsolete. Therefore, the Netlink interface
|
||||
to Open vSwitch is designed to allow carefully written userspace
|
||||
applications to work with any version of the flow key, past or future.
|
||||
|
||||
To support this forward and backward compatibility, whenever the
|
||||
kernel module passes a packet to userspace, it also passes along the
|
||||
flow key that it parsed from the packet. Userspace then extracts its
|
||||
own notion of a flow key from the packet and compares it against the
|
||||
kernel-provided version:
|
||||
|
||||
- If userspace's notion of the flow key for the packet matches the
|
||||
kernel's, then nothing special is necessary.
|
||||
|
||||
- If the kernel's flow key includes more fields than the userspace
|
||||
version of the flow key, for example if the kernel decoded IPv6
|
||||
headers but userspace stopped at the Ethernet type (because it
|
||||
does not understand IPv6), then again nothing special is
|
||||
necessary. Userspace can still set up a flow in the usual way,
|
||||
as long as it uses the kernel-provided flow key to do it.
|
||||
|
||||
- If the userspace flow key includes more fields than the
|
||||
kernel's, for example if userspace decoded an IPv6 header but
|
||||
the kernel stopped at the Ethernet type, then userspace can
|
||||
forward the packet manually, without setting up a flow in the
|
||||
kernel. This case is bad for performance because every packet
|
||||
that the kernel considers part of the flow must go to userspace,
|
||||
but the forwarding behavior is correct. (If userspace can
|
||||
determine that the values of the extra fields would not affect
|
||||
forwarding behavior, then it could set up a flow anyway.)
|
||||
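The three cases above can be sketched as a small decision function. This is only an illustration, assuming each flow key has been reduced to a bitmask of the attribute types it contains. The names key_attr, compat_action and classify_flow_key are hypothetical and are not part of <linux/openvswitch.h>.

```c
#include <stdint.h>

/* Hypothetical attribute bits, one per flow key attribute type. */
enum key_attr {
    KEY_ETH      = 1u << 0,
    KEY_ETH_TYPE = 1u << 1,
    KEY_IPV6     = 1u << 2,
    KEY_TCP      = 1u << 3
};

enum compat_action {
    INSTALL_FLOW,            /* keys agree: nothing special to do */
    INSTALL_WITH_KERNEL_KEY, /* kernel parsed more: reuse its key */
    FORWARD_MANUALLY         /* userspace parsed more: no kernel flow */
};

/* Compare the attribute sets of the kernel-provided key and the key that
 * userspace extracted from the same packet. */
enum compat_action classify_flow_key(uint32_t kernel_attrs, uint32_t user_attrs)
{
    if (kernel_attrs == user_attrs)
        return INSTALL_FLOW;
    /* Every attribute userspace knows is also in the kernel key. */
    if ((user_attrs & ~kernel_attrs) == 0)
        return INSTALL_WITH_KERNEL_KEY;
    /* Userspace sees fields the kernel did not parse. */
    return FORWARD_MANUALLY;
}
```

A real application would compare attribute contents as well as types; the sketch only shows which of the three branches applies.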
|
||||
How flow keys evolve over time is important to making this work, so
|
||||
the following sections go into detail.
|
||||
|
||||
|
||||
Flow key format
|
||||
---------------
|
||||
|
||||
A flow key is passed over a Netlink socket as a sequence of Netlink
|
||||
attributes. Some attributes represent packet metadata, defined as any
|
||||
information about a packet that cannot be extracted from the packet
|
||||
itself, e.g. the vport on which the packet was received. Most
|
||||
attributes, however, are extracted from headers within the packet,
|
||||
e.g. source and destination addresses from Ethernet, IP, or TCP
|
||||
headers.
|
||||
|
||||
The <linux/openvswitch.h> header file defines the exact format of the
|
||||
flow key attributes. For informal explanatory purposes here, we write
|
||||
them as comma-separated strings, with parentheses indicating arguments
|
||||
and nesting. For example, the following could represent a flow key
|
||||
corresponding to a TCP packet that arrived on vport 1:
|
||||
|
||||
in_port(1), eth(src=e0:91:f5:21:d0:b2, dst=00:02:e3:0f:80:a4),
|
||||
eth_type(0x0800), ipv4(src=172.16.0.20, dst=172.18.0.52, proto=6, tos=0,
|
||||
frag=no), tcp(src=49163, dst=80)
|
||||
|
||||
Often we ellipsize arguments not important to the discussion, e.g.:
|
||||
|
||||
in_port(1), eth(...), eth_type(0x0800), ipv4(...), tcp(...)
|
||||
|
||||
|
||||
Wildcarded flow key format
|
||||
--------------------------
|
||||
|
||||
A wildcarded flow is described with two sequences of Netlink attributes
|
||||
passed over the Netlink socket. A flow key, exactly as described above, and an
|
||||
optional corresponding flow mask.
|
||||
|
||||
A wildcarded flow can represent a group of exact match flows. Each '1' bit
|
||||
in the mask specifies an exact match with the corresponding bit in the flow key.
|
||||
A '0' bit specifies a don't care bit, which will match either a '1' or '0' bit
|
||||
of an incoming packet. Using wildcarded flows can improve the flow setup rate
|
||||
by reducing the number of new flows that need to be processed by the user space program.
|
||||
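The mask semantics described above amount to a bitwise comparison. The following is a minimal sketch, assuming the flow key, the mask and the key parsed from the packet are available as flat byte arrays of equal length. The function name masked_key_match is hypothetical and is not the kernel's implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if the packet key matches the flow key on every bit that the
 * mask marks as exact match ('1'). Bits where the mask is '0' are don't-care
 * bits and are ignored. */
bool masked_key_match(const uint8_t *pkt, const uint8_t *flow,
                      const uint8_t *mask, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        /* XOR exposes differing bits; AND keeps only the exact-match ones. */
        if ((pkt[i] ^ flow[i]) & mask[i])
            return false;
    }
    return true;
}
```

With an all-ones mask this degenerates to an exact match, which is what a kernel that ignores the mask attribute would install.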
|
||||
Support for the mask Netlink attribute is optional for both the kernel and user
|
||||
space program. The kernel can ignore the mask attribute, installing an exact
|
||||
match flow, or reduce the number of don't care bits in the kernel to less than
|
||||
what was specified by the user space program. In this case, variations in bits
|
||||
that the kernel does not implement will simply result in additional flow setups.
|
||||
The kernel module will also work with user space programs that neither support
|
||||
nor supply flow mask attributes.
|
||||
|
||||
Since the kernel may ignore or modify wildcard bits, it can be difficult for
|
||||
the userspace program to know exactly what matches are installed. There are
|
||||
two possible approaches: reactively install flows as they miss the kernel
|
||||
flow table (and therefore not attempt to determine wildcard changes at all)
|
||||
or use the kernel's response messages to determine the installed wildcards.
|
||||
|
||||
When interacting with userspace, the kernel should maintain the match portion
|
||||
of the key exactly as originally installed. This provides a handle to
|
||||
identify the flow for all future operations. However, when reporting the
|
||||
mask of an installed flow, the mask should include any restrictions imposed
|
||||
by the kernel.
|
||||
|
||||
The behavior when using overlapping wildcarded flows is undefined. It is the
|
||||
responsibility of the user space program to ensure that any incoming packet
|
||||
can match at most one flow, wildcarded or not. The current implementation
|
||||
performs best-effort detection of overlapping wildcarded flows and may reject
|
||||
some but not all of them. However, this behavior may change in future versions.
|
||||
|
||||
|
||||
Basic rule for evolving flow keys
|
||||
---------------------------------
|
||||
|
||||
Some care is needed to really maintain forward and backward
|
||||
compatibility for applications that follow the rules listed under
|
||||
"Flow key compatibility" above.
|
||||
|
||||
The basic rule is obvious:
|
||||
|
||||
------------------------------------------------------------------
|
||||
New network protocol support must only supplement existing flow
|
||||
key attributes. It must not change the meaning of already defined
|
||||
flow key attributes.
|
||||
------------------------------------------------------------------
|
||||
|
||||
This rule does have less-obvious consequences so it is worth working
|
||||
through a few examples. Suppose, for example, that the kernel module
|
||||
did not already implement VLAN parsing. Instead, it just interpreted
|
||||
the 802.1Q TPID (0x8100) as the Ethertype then stopped parsing the
|
||||
packet. The flow key for any packet with an 802.1Q header would look
|
||||
essentially like this, ignoring metadata:
|
||||
|
||||
eth(...), eth_type(0x8100)
|
||||
|
||||
Naively, to add VLAN support, it makes sense to add a new "vlan" flow
|
||||
key attribute to contain the VLAN tag, then continue to decode the
|
||||
encapsulated headers beyond the VLAN tag using the existing field
|
||||
definitions. With this change, a TCP packet in VLAN 10 would have a
|
||||
flow key much like this:
|
||||
|
||||
eth(...), vlan(vid=10, pcp=0), eth_type(0x0800), ip(proto=6, ...), tcp(...)
|
||||
|
||||
But this change would negatively affect a userspace application that
|
||||
has not been updated to understand the new "vlan" flow key attribute.
|
||||
The application could, following the flow compatibility rules above,
|
||||
ignore the "vlan" attribute that it does not understand and therefore
|
||||
assume that the flow contained IP packets. This is a bad assumption
|
||||
(the flow only contains IP packets if one parses and skips over the
|
||||
802.1Q header) and it could cause the application's behavior to change
|
||||
across kernel versions even though it follows the compatibility rules.
|
||||
|
||||
The solution is to use a set of nested attributes. This is, for
|
||||
example, why 802.1Q support uses nested attributes. A TCP packet in
|
||||
VLAN 10 is actually expressed as:
|
||||
|
||||
eth(...), eth_type(0x8100), vlan(vid=10, pcp=0), encap(eth_type(0x0800),
|
||||
ip(proto=6, ...), tcp(...))
|
||||
|
||||
Notice how the "eth_type", "ip", and "tcp" flow key attributes are
|
||||
nested inside the "encap" attribute. Thus, an application that does
|
||||
not understand the "vlan" key will not see any of those attributes
|
||||
and therefore will not misinterpret them. (Also, the outer eth_type
|
||||
is still 0x8100, not changed to 0x0800.)
|
||||
|
||||
Handling malformed packets
|
||||
--------------------------
|
||||
|
||||
Don't drop packets in the kernel for malformed protocol headers, bad
|
||||
checksums, etc. This would prevent userspace from implementing a
|
||||
simple Ethernet switch that forwards every packet.
|
||||
|
||||
Instead, in such a case, include an attribute with "empty" content.
|
||||
It doesn't matter if the empty content could be valid protocol values,
|
||||
as long as those values are rarely seen in practice, because userspace
|
||||
can always have all packets with those values sent to userspace and
|
||||
handle them individually.
|
||||
|
||||
For example, consider a packet that contains an IP header that
|
||||
indicates protocol 6 for TCP, but which is truncated just after the IP
|
||||
header, so that the TCP header is missing. The flow key for this
|
||||
packet would include a tcp attribute with all-zero src and dst, like
|
||||
this:
|
||||
|
||||
eth(...), eth_type(0x0800), ip(proto=6, ...), tcp(src=0, dst=0)
|
||||
|
||||
As another example, consider a packet with an Ethernet type of 0x8100,
|
||||
indicating that a VLAN TCI should follow, but which is truncated just
|
||||
after the Ethernet type. The flow key for this packet would include
|
||||
an all-zero-bits vlan and an empty encap attribute, like this:
|
||||
|
||||
eth(...), eth_type(0x8100), vlan(0), encap()
|
||||
|
||||
Unlike a TCP packet with source and destination ports 0, an
|
||||
all-zero-bits VLAN TCI is not that rare, so the CFI bit (aka
|
||||
VLAN_TAG_PRESENT inside the kernel) is ordinarily set in a vlan
|
||||
attribute expressly to allow this situation to be distinguished.
|
||||
Thus, the flow key in this second example unambiguously indicates a
|
||||
missing or malformed VLAN TCI.
|
||||
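The role of the CFI bit can be illustrated with a short sketch. It assumes the TCI is handled as a 16-bit value in host byte order, with the VLAN ID in the low 12 bits, the CFI bit at 0x1000 and the priority in the top 3 bits. The macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TCI_VID_MASK  0x0fffu
#define TCI_CFI_BIT   0x1000u /* assumed to correspond to VLAN_TAG_PRESENT */
#define TCI_PCP_SHIFT 13

/* Build the TCI value for a vlan attribute of a well-formed packet: the CFI
 * bit is always set, so even vid=0, pcp=0 yields a nonzero value. */
uint16_t vlan_attr_tci(uint16_t vid, uint8_t pcp)
{
    return (uint16_t)(((pcp & 0x7u) << TCI_PCP_SHIFT) | TCI_CFI_BIT |
                      (vid & TCI_VID_MASK));
}

/* A vlan attribute with the CFI bit clear, such as vlan(0), marks a missing
 * or malformed VLAN TCI. */
bool vlan_attr_is_missing(uint16_t tci)
{
    return (tci & TCI_CFI_BIT) == 0;
}
```

Under these assumptions a genuine all-zero TCI is reported as 0x1000, so it cannot be confused with the vlan(0) of a truncated packet.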
|
||||
Other rules
|
||||
-----------
|
||||
|
||||
The other rules for flow keys are much less subtle:
|
||||
|
||||
- Duplicate attributes are not allowed at a given nesting level.
|
||||
|
||||
- Ordering of attributes is not significant.
|
||||
|
||||
- When the kernel sends a given flow key to userspace, it always
|
||||
composes it the same way. This allows userspace to hash and
|
||||
compare entire flow keys that it may not be able to fully
|
||||
interpret.
|
||||
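Because the kernel always composes a given flow key the same way, userspace can treat the serialized key as an opaque byte string. The sketch below assumes the key is available as a contiguous buffer. The FNV-1a hash is just one possible choice, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit FNV-1a hash over the raw bytes of a serialized flow key. */
uint32_t flow_key_hash(const void *key, size_t len)
{
    const uint8_t *p = key;
    uint32_t h = 2166136261u;      /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;            /* FNV prime */
    }
    return h;
}

/* Two keys are equal when their serialized forms are byte-identical; no
 * attribute needs to be interpreted. */
bool flow_key_equal(const void *a, size_t alen, const void *b, size_t blen)
{
    return alen == blen && memcmp(a, b, alen) == 0;
}
```

This only works for keys received from the kernel. Keys built by userspace may order attributes differently, since ordering is not significant.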
162
Documentation/networking/operstates.txt
Normal file
|
|
@ -0,0 +1,162 @@
|
|||
|
||||
1. Introduction
|
||||
|
||||
Linux distinguishes between administrative and operational state of an
|
||||
interface. Administrative state is the result of "ip link set dev
|
||||
<dev> up or down" and reflects whether the administrator wants to use
|
||||
the device for traffic.
|
||||
|
||||
However, an interface is not usable just because the admin enabled it
|
||||
- ethernet needs to be plugged into the switch and, depending on
|
||||
a site's networking policy and configuration, an 802.1X authentication
|
||||
to be performed before user data can be transferred. Operational state
|
||||
shows the ability of an interface to transmit this user data.
|
||||
|
||||
Thanks to 802.1X, userspace must be granted the possibility to
|
||||
influence operational state. To accommodate this, operational state is
|
||||
split into two parts: Two flags that can be set by the driver only, and
|
||||
an RFC2863 compatible state that is derived from these flags, a policy,
|
||||
and changeable from userspace under certain rules.
|
||||
|
||||
|
||||
2. Querying from userspace
|
||||
|
||||
Both admin and operational state can be queried via the netlink
|
||||
operation RTM_GETLINK. It is also possible to subscribe to RTMGRP_LINK
|
||||
to be notified of updates. This is important for setting from userspace.
|
||||
|
||||
These values contain interface state:
|
||||
|
||||
ifinfomsg::if_flags & IFF_UP:
|
||||
Interface is admin up
|
||||
ifinfomsg::if_flags & IFF_RUNNING:
|
||||
Interface is in RFC2863 operational state UP or UNKNOWN. This is for
|
||||
backward compatibility; routing daemons and DHCP clients can use this
|
||||
flag to determine whether they should use the interface.
|
||||
ifinfomsg::if_flags & IFF_LOWER_UP:
|
||||
Driver has signaled netif_carrier_on()
|
||||
ifinfomsg::if_flags & IFF_DORMANT:
|
||||
Driver has signaled netif_dormant_on()
|
||||
|
||||
TLV IFLA_OPERSTATE
|
||||
|
||||
contains RFC2863 state of the interface in numeric representation:
|
||||
|
||||
IF_OPER_UNKNOWN (0):
|
||||
Interface is in unknown state, neither driver nor userspace has set
|
||||
operational state. Interface must be considered for user data as
|
||||
setting operational state has not been implemented in every driver.
|
||||
IF_OPER_NOTPRESENT (1):
|
||||
Unused in current kernel (notpresent interfaces normally disappear),
|
||||
just a numerical placeholder.
|
||||
IF_OPER_DOWN (2):
|
||||
Interface is unable to transfer data on L1, f.e. ethernet is not
|
||||
plugged or interface is ADMIN down.
|
||||
IF_OPER_LOWERLAYERDOWN (3):
|
||||
Interfaces stacked on an interface that is IF_OPER_DOWN show this
|
||||
state (f.e. VLAN).
|
||||
IF_OPER_TESTING (4):
|
||||
Unused in current kernel.
|
||||
IF_OPER_DORMANT (5):
|
||||
Interface is L1 up, but waiting for an external event, f.e. for a
|
||||
protocol to establish. (802.1X)
|
||||
IF_OPER_UP (6):
|
||||
Interface is operational up and can be used.
|
||||
|
||||
This TLV can also be queried via sysfs.
|
||||
|
||||
TLV IFLA_LINKMODE
|
||||
|
||||
contains link policy. This is needed for userspace interaction
|
||||
described below.
|
||||
|
||||
This TLV can also be queried via sysfs.
|
||||
|
||||
|
||||
3. Kernel driver API
|
||||
|
||||
Kernel drivers have access to two flags that map to IFF_LOWER_UP and
|
||||
IFF_DORMANT. These flags can be set from everywhere, even from
|
||||
interrupts. It is guaranteed that only the driver has write access,
|
||||
however, if different layers of the driver manipulate the same flag,
|
||||
the driver has to provide the synchronisation needed.
|
||||
|
||||
__LINK_STATE_NOCARRIER, maps to !IFF_LOWER_UP:
|
||||
|
||||
The driver uses netif_carrier_on() to clear and netif_carrier_off() to
|
||||
set this flag. On netif_carrier_off(), the scheduler stops sending
|
||||
packets. The name 'carrier' and the inversion are historical, think of
|
||||
it as lower layer.
|
||||
|
||||
Note that for certain kinds of soft-devices, which are not managing any
|
||||
real hardware, it is possible to set this bit from userspace. One
|
||||
should use TLV IFLA_CARRIER to do so.
|
||||
|
||||
netif_carrier_ok() can be used to query that bit.
|
||||
|
||||
__LINK_STATE_DORMANT, maps to IFF_DORMANT:
|
||||
|
||||
Set by the driver to express that the device cannot yet be used
|
||||
because some driver controlled protocol establishment has to
|
||||
complete. Corresponding functions are netif_dormant_on() to set the
|
||||
flag, netif_dormant_off() to clear it and netif_dormant() to query.
|
||||
|
||||
On device allocation, networking core sets the flags equivalent to
|
||||
netif_carrier_ok() and !netif_dormant().
|
||||
|
||||
|
||||
Whenever the driver CHANGES one of these flags, a workqueue event is
|
||||
scheduled to translate the flag combination to IFLA_OPERSTATE as
|
||||
follows:
|
||||
|
||||
!netif_carrier_ok():
|
||||
IF_OPER_LOWERLAYERDOWN if the interface is stacked, IF_OPER_DOWN
|
||||
otherwise. Kernel can recognise stacked interfaces because their
|
||||
ifindex != iflink.
|
||||
|
||||
netif_carrier_ok() && netif_dormant():
|
||||
IF_OPER_DORMANT
|
||||
|
||||
netif_carrier_ok() && !netif_dormant():
|
||||
IF_OPER_UP if userspace interaction is disabled. Otherwise
|
||||
IF_OPER_DORMANT with the possibility for userspace to initiate the
|
||||
IF_OPER_UP transition afterwards.
|
||||
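The translation rules above can be written down as a small pure function. This is a sketch for illustration, not the kernel's code: the enum reuses the names and numeric values listed in section 2, while struct link_flags and default_operstate are hypothetical names.

```c
#include <stdbool.h>

/* RFC2863 states with the numeric values listed in section 2. */
enum if_oper {
    IF_OPER_UNKNOWN        = 0,
    IF_OPER_NOTPRESENT     = 1,
    IF_OPER_DOWN           = 2,
    IF_OPER_LOWERLAYERDOWN = 3,
    IF_OPER_TESTING        = 4,
    IF_OPER_DORMANT        = 5,
    IF_OPER_UP             = 6
};

/* Hypothetical snapshot of the inputs to the translation. */
struct link_flags {
    bool carrier_ok;       /* netif_carrier_ok() */
    bool dormant;          /* netif_dormant() */
    bool stacked;          /* ifindex != iflink */
    bool linkmode_dormant; /* userspace interaction enabled (IFLA_LINKMODE 1) */
};

enum if_oper default_operstate(const struct link_flags *f)
{
    if (!f->carrier_ok)
        return f->stacked ? IF_OPER_LOWERLAYERDOWN : IF_OPER_DOWN;
    if (f->dormant)
        return IF_OPER_DORMANT;
    /* Carrier up and not dormant: userspace may still hold the link back. */
    return f->linkmode_dormant ? IF_OPER_DORMANT : IF_OPER_UP;
}
```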
|
||||
|
||||
4. Setting from userspace
|
||||
|
||||
Applications have to use the netlink interface to influence the
|
||||
RFC2863 operational state of an interface. Setting IFLA_LINKMODE to 1
|
||||
via RTM_SETLINK instructs the kernel that an interface should go to
|
||||
IF_OPER_DORMANT instead of IF_OPER_UP when the combination
|
||||
netif_carrier_ok() && !netif_dormant() is set by the
|
||||
driver. Afterwards, the userspace application can set IFLA_OPERSTATE
|
||||
to IF_OPER_DORMANT or IF_OPER_UP as long as the driver does not set
|
||||
netif_carrier_off() or netif_dormant_on(). Changes made by userspace
|
||||
are multicast on the netlink group RTMGRP_LINK.
|
||||
|
||||
So basically an 802.1X supplicant interacts with the kernel like this:
|
||||
|
||||
-subscribe to RTMGRP_LINK
|
||||
-set IFLA_LINKMODE to 1 via RTM_SETLINK
|
||||
-query RTM_GETLINK once to get initial state
|
||||
-if initial flags are not (IFF_LOWER_UP && !IFF_DORMANT), wait until
|
||||
netlink multicast signals this state
|
||||
-do 802.1X, eventually abort if flags go down again
|
||||
-send RTM_SETLINK to set operstate to IF_OPER_UP if authentication
|
||||
succeeds, IF_OPER_DORMANT otherwise
|
||||
-see how operstate and IFF_RUNNING are echoed via netlink multicast
|
||||
-set interface back to IF_OPER_DORMANT if 802.1X reauthentication
|
||||
fails
|
||||
-restart if kernel changes IFF_LOWER_UP or IFF_DORMANT flag
|
||||
|
||||
If the supplicant goes down, set IFLA_LINKMODE back to 0 and
|
||||
IFLA_OPERSTATE to a sane value.
|
||||
|
||||
A routing daemon or dhcp client just needs to care for IFF_RUNNING or
|
||||
waiting for operstate to go IF_OPER_UP/IF_OPER_UNKNOWN before
|
||||
considering the interface / querying a DHCP address.
|
||||
|
||||
|
||||
For technical questions and/or comments please e-mail to Stefan Rompf
|
||||
(stefan at loplof.de).
|
||||
1061
Documentation/networking/packet_mmap.txt
Normal file
File diff suppressed because it is too large
214
Documentation/networking/phonet.txt
Normal file
|
|
@ -0,0 +1,214 @@
|
|||
Linux Phonet protocol family
|
||||
============================
|
||||
|
||||
Introduction
|
||||
------------
|
||||
|
||||
Phonet is a packet protocol used by Nokia cellular modems for both IPC
|
||||
and RPC. With the Linux Phonet socket family, Linux host processes can
|
||||
receive and send messages from/to the modem, or any other external
|
||||
device attached to the modem. The modem takes care of routing.
|
||||
|
||||
Phonet packets can be exchanged through various hardware connections
|
||||
depending on the device, such as:
|
||||
- USB with the CDC Phonet interface,
|
||||
- infrared,
|
||||
- Bluetooth,
|
||||
- an RS232 serial port (with a dedicated "FBUS" line discipline),
|
||||
- the SSI bus with some TI OMAP processors.
|
||||
|
||||
|
||||
Packets format
|
||||
--------------
|
||||
|
||||
Phonet packets have a common header as follows:
|
||||
|
||||
struct phonethdr {
|
||||
uint8_t pn_media; /* Media type (link-layer identifier) */
|
||||
uint8_t pn_rdev; /* Receiver device ID */
|
||||
uint8_t pn_sdev; /* Sender device ID */
|
||||
uint8_t pn_res; /* Resource ID or function */
|
||||
uint16_t pn_length; /* Big-endian message byte length (minus 6) */
|
||||
uint8_t pn_robj; /* Receiver object ID */
|
||||
uint8_t pn_sobj; /* Sender object ID */
|
||||
};
|
||||
|
||||
On Linux, the link-layer header includes the pn_media byte (see below).
|
||||
The next 7 bytes are part of the network-layer header.
|
||||
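As a sketch of how the length field might be used, the helper below recovers the total packet size from a raw buffer that starts with the media type byte. It assumes the 6 bytes not counted by pn_length are those up to and including the length field itself. PN_HDR_LEN and pn_packet_len are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define PN_HDR_LEN 8u /* media, rdev, sdev, res, length (2), robj, sobj */

/* Return the total packet length in bytes, including the media type byte,
 * or 0 if the buffer is too short to hold a header or the length field is
 * smaller than the two object ID bytes it must cover. */
size_t pn_packet_len(const uint8_t *pkt, size_t avail)
{
    uint16_t field;

    if (avail < PN_HDR_LEN)
        return 0;
    /* pn_length sits at offsets 4 and 5 and is big-endian. */
    field = (uint16_t)((pkt[4] << 8) | pkt[5]);
    if (field < 2)
        return 0;
    return (size_t)field + 6u;
}
```

A caller would still need to check that the returned length does not exceed the number of bytes actually received.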
|
||||
The device ID is split: the 6 higher-order bits constitute the device
|
||||
address, while the 2 lower-order bits are used for multiplexing, as are
|
||||
the 8-bit object identifiers. As such, Phonet can be considered as a
|
||||
network layer with 6 bits of address space and 10 bits for transport
|
||||
protocol (much like port numbers in IP world).
|
||||
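The split of the device ID can be shown with two bit-masking helpers. This is a minimal sketch: the function names are hypothetical, and the device address is kept in place in the upper bits rather than shifted down, which is an assumption about how addresses are written.

```c
#include <stdint.h>

/* Device address: the 6 higher-order bits of the device ID byte. */
uint8_t phonet_dev_address(uint8_t dev)
{
    return dev & 0xfcu;
}

/* 10-bit object number: the 2 lower-order bits of the device ID byte
 * followed by the 8-bit object ID. */
uint16_t phonet_object_10bit(uint8_t dev, uint8_t obj)
{
    return (uint16_t)(((dev & 0x03u) << 8) | obj);
}
```

For a device ID byte of 0x6d and an object ID of 0x23, the address part is 0x6c and the 10-bit object number is 0x123.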
|
||||
The modem always has address number zero. All other devices have their
|
||||
own 6-bit address.
|
||||
|
||||
|
||||
Link layer
|
||||
----------
|
||||
|
||||
Phonet links are always point-to-point links. The link layer header
|
||||
consists of a single Phonet media type byte. It uniquely identifies the
|
||||
link through which the packet is transmitted, from the modem's
|
||||
perspective. Each Phonet network device shall prepend and set the media
|
||||
type byte as appropriate. For convenience, a common phonet_header_ops
|
||||
link-layer header operations structure is provided. It sets the
|
||||
media type according to the network device hardware address.
|
||||
|
||||
Linux Phonet network interfaces support a dedicated link layer packet
|
||||
type (ETH_P_PHONET) which is out of the Ethernet type range. They can
|
||||
only send and receive Phonet packets.
|
||||
|
||||
The virtual TUN tunnel device driver can also be used for Phonet. This
|
||||
requires IFF_TUN mode, _without_ the IFF_NO_PI flag. In this case,
|
||||
there is no link-layer header, so there is no Phonet media type byte.
|
||||
|
||||
Note that Phonet interfaces are not allowed to re-order packets, so
|
||||
only the (default) Linux FIFO qdisc should be used with them.
|
||||
|
||||
|
||||
Network layer
|
||||
-------------
|
||||
|
||||
The Phonet socket address family maps the Phonet packet header:
|
||||
|
||||
struct sockaddr_pn {
|
||||
sa_family_t spn_family; /* AF_PHONET */
|
||||
uint8_t spn_obj; /* Object ID */
|
||||
uint8_t spn_dev; /* Device ID */
|
||||
uint8_t spn_resource; /* Resource or function */
|
||||
uint8_t spn_zero[...]; /* Padding */
|
||||
};
|
||||
|
||||
The resource field is only used when sending and receiving;
|
||||
it is ignored by bind() and getsockname().
|
||||
|
||||
|
||||
Low-level datagram protocol
|
||||
---------------------------
|
||||
|
||||
Applications can send Phonet messages using the Phonet datagram socket
|
||||
protocol from the PF_PHONET family. Each socket is bound to one of the
|
||||
2^10 object IDs available, and can send and receive packets with any
|
||||
other peer.
|
||||
|
||||
struct sockaddr_pn addr = { .spn_family = AF_PHONET, };
|
||||
ssize_t len;
|
||||
socklen_t addrlen = sizeof(addr);
|
||||
int fd;
|
||||
|
||||
fd = socket(PF_PHONET, SOCK_DGRAM, 0);
|
||||
bind(fd, (struct sockaddr *)&addr, sizeof(addr));
|
||||
/* ... */
|
||||
|
||||
sendto(fd, msg, msglen, 0, (struct sockaddr *)&addr, sizeof(addr));
|
||||
len = recvfrom(fd, buf, sizeof(buf), 0,
|
||||
(struct sockaddr *)&addr, &addrlen);
|
||||
|
||||
This protocol follows the SOCK_DGRAM connection-less semantics.
|
||||
However, connect() and getpeername() are not supported, as they did
|
||||
not seem useful with Phonet usages (could be added easily).
|
||||
|
||||
|
||||
Resource subscription
|
||||
---------------------
|
||||
|
||||
A Phonet datagram socket can be subscribed to any number of 8-bit
|
||||
Phonet resources, as follows:
|
||||
|
||||
uint32_t res = 0xXX;
|
||||
ioctl(fd, SIOCPNADDRESOURCE, &res);
|
||||
|
||||
Subscription is similarly cancelled using the SIOCPNDELRESOURCE I/O
|
||||
control request, or when the socket is closed.
|
||||
|
||||
Note that no more than one socket can be subscribed to any given
|
||||
resource at a time. Otherwise, ioctl() will return EBUSY.
|
||||
|
||||
|
||||
Phonet Pipe protocol
|
||||
--------------------
|
||||
|
||||
The Phonet Pipe protocol is a simple sequenced packet protocol
|
||||
with end-to-end congestion control. It uses the passive listening
|
||||
socket paradigm. The listening socket is bound to a unique free object
|
||||
ID. Each listening socket can handle up to 255 simultaneous
|
||||
connections, one per accept()'d socket.
|
||||
|
||||
int lfd, cfd;
|
||||
|
||||
lfd = socket(PF_PHONET, SOCK_SEQPACKET, PN_PROTO_PIPE);
|
||||
listen (lfd, INT_MAX);
|
||||
|
||||
/* ... */
|
||||
cfd = accept(lfd, NULL, NULL);
|
||||
for (;;)
|
||||
{
|
||||
char buf[...];
|
||||
ssize_t len = read(cfd, buf, sizeof(buf));
|
||||
|
||||
/* ... */
|
||||
|
||||
write(cfd, msg, msglen);
|
||||
}
|
||||
|
||||
Connections are traditionally established between two endpoints by a
|
||||
"third party" application. This means that both endpoints are passive.
|
||||
|
||||
|
||||
As of Linux kernel version 2.6.39, it is also possible to connect
|
||||
two endpoints directly, using connect() on the active side. This is
|
||||
intended to support the newer Nokia Wireless Modem API, as found in
|
||||
e.g. the Nokia Slim Modem in the ST-Ericsson U8500 platform:
|
||||
|
||||
struct sockaddr_pn spn;
|
||||
int fd;
|
||||
|
||||
fd = socket(PF_PHONET, SOCK_SEQPACKET, PN_PROTO_PIPE);
|
||||
memset(&spn, 0, sizeof(spn));
|
||||
spn.spn_family = AF_PHONET;
|
||||
spn.spn_obj = ...;
|
||||
spn.spn_dev = ...;
|
||||
spn.spn_resource = 0xD9;
|
||||
connect(fd, (struct sockaddr *)&spn, sizeof(spn));
|
||||
/* normal I/O here ... */
|
||||
close(fd);
|
||||
|
||||
|
||||
WARNING:
|
||||
When polling a connected pipe socket for writability, there is an
|
||||
intrinsic race condition whereby writability might be lost between the
|
||||
polling and the writing system calls. In this case, the socket will
|
||||
block until write becomes possible again, unless non-blocking mode
|
||||
is enabled.
|
||||
|
||||
|
||||
The pipe protocol provides two socket options at the SOL_PNPIPE level:
|
||||
|
||||
PNPIPE_ENCAP accepts one integer value (int) of:
|
||||
|
||||
PNPIPE_ENCAP_NONE: The socket operates normally (default).
|
||||
|
||||
PNPIPE_ENCAP_IP: The socket is used as a backend for a virtual IP
|
||||
interface. This requires CAP_NET_ADMIN capability. GPRS data
|
||||
support on Nokia modems can use this. Note that the socket cannot
|
||||
be reliably poll()'d or read() from while in this mode.
|
||||
|
||||
PNPIPE_IFINDEX is a read-only integer value. It contains the
|
||||
interface index of the network interface created by PNPIPE_ENCAP,
|
||||
or zero if encapsulation is off.
|
||||
|
||||
PNPIPE_HANDLE is a read-only integer value. It contains the underlying
|
||||
identifier ("pipe handle") of the pipe. This is only defined for
|
||||
socket descriptors that are already connected or being connected.
|
||||
|
||||
|
||||
Authors
|
||||
-------
|
||||
|
||||
Linux Phonet was initially written by Sakari Ailus.
|
||||
Other contributors include Mikä Liljeberg, Andras Domokos,
|
||||
Carlos Chinea and Rémi Denis-Courmont.
|
||||
Copyright (C) 2008 Nokia Corporation.
|
||||
339
Documentation/networking/phy.txt
Normal file
|
|
@ -0,0 +1,339 @@
|
|||
|
||||
-------
|
||||
PHY Abstraction Layer
|
||||
(Updated 2008-04-08)
|
||||
|
||||
Purpose
|
||||
|
||||
Most network devices consist of a set of registers which provide an interface
|
||||
to a MAC layer, which communicates with the physical connection through a
|
||||
PHY. The PHY concerns itself with negotiating link parameters with the link
|
||||
partner on the other side of the network connection (typically, an ethernet
|
||||
cable), and provides a register interface to allow drivers to determine what
|
||||
settings were chosen, and to configure what settings are allowed.
|
||||
|
||||
While these devices are distinct from the network devices, and conform to a
|
||||
standard layout for the registers, it has been common practice to integrate
|
||||
the PHY management code with the network driver. This has resulted in large
|
||||
amounts of redundant code. Also, on embedded systems with multiple (and
|
||||
sometimes quite different) ethernet controllers connected to the same
|
||||
management bus, it is difficult to ensure safe use of the bus.
|
||||
|
||||
Since the PHYs are devices, and the management busses through which they are
|
||||
accessed are, in fact, busses, the PHY Abstraction Layer treats them as such.
|
||||
In doing so, it has these goals:
|
||||
|
||||
1) Increase code-reuse
|
||||
2) Increase overall code-maintainability
|
||||
3) Speed development time for new network drivers, and for new systems
|
||||
|
||||
Basically, this layer is meant to provide an interface to PHY devices which
|
||||
allows network driver writers to write as little code as possible, while
|
||||
still providing a full feature set.
|
||||
|
||||
The MDIO bus
|
||||
|
||||
Most network devices are connected to a PHY by means of a management bus.
|
||||
Different devices use different busses (though some share common interfaces).
|
||||
In order to take advantage of the PAL, each bus interface needs to be
|
||||
registered as a distinct device.
|
||||
|
||||
1) read and write functions must be implemented. Their prototypes are:
|
||||
|
||||
int write(struct mii_bus *bus, int mii_id, int regnum, u16 value);
|
||||
int read(struct mii_bus *bus, int mii_id, int regnum);
|
||||
|
||||
mii_id is the address on the bus for the PHY, and regnum is the register
|
||||
number. These functions are guaranteed not to be called from interrupt
|
||||
time, so it is safe for them to block, waiting for an interrupt to signal
|
||||
the operation is complete.
|
||||
|
||||
2) A reset function is optional. This is used to return the bus to an
|
||||
initialized state.
|
||||
|
||||
3) A probe function is needed. This function should set up anything the bus
|
||||
driver needs, set up the mii_bus structure, and register with the PAL using
|
||||
mdiobus_register. Similarly, there's a remove function to undo all of
|
||||
that (use mdiobus_unregister).
|
||||
|
||||
4) Like any driver, the device_driver structure must be configured, and init
|
||||
and exit functions are used to register the driver.
|
||||
|
||||
5) The bus must also be declared somewhere as a device, and registered.
|
||||
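To make the shape of the read and write callbacks concrete, here is a userspace mock that follows the two prototypes from step 1. It is only a sketch: struct fake_mii_bus stands in for the kernel's struct mii_bus, the register file is a plain array instead of real hardware, and all names are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

#define FAKE_PHY_COUNT 32 /* 5-bit PHY address */
#define FAKE_REG_COUNT 32 /* 5-bit register number */

/* Stand-in for struct mii_bus: a register file for every PHY address. */
struct fake_mii_bus {
    uint16_t regs[FAKE_PHY_COUNT][FAKE_REG_COUNT];
};

static int fake_mdio_valid(int mii_id, int regnum)
{
    return mii_id >= 0 && mii_id < FAKE_PHY_COUNT &&
           regnum >= 0 && regnum < FAKE_REG_COUNT;
}

/* Store value in register regnum of the PHY at address mii_id. */
int fake_mdio_write(struct fake_mii_bus *bus, int mii_id, int regnum,
                    uint16_t value)
{
    if (!fake_mdio_valid(mii_id, regnum))
        return -EINVAL;
    bus->regs[mii_id][regnum] = value;
    return 0;
}

/* Return the 16-bit register value, or a negative errno on failure. */
int fake_mdio_read(struct fake_mii_bus *bus, int mii_id, int regnum)
{
    if (!fake_mdio_valid(mii_id, regnum))
        return -EINVAL;
    return bus->regs[mii_id][regnum];
}
```

Returning the register value as a non-negative int leaves negative values free for error codes, which is why read returns int rather than u16.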
|
||||
As an example for how one driver implemented an mdio bus driver, see
|
||||
drivers/net/ethernet/freescale/fsl_pq_mdio.c and an associated DTS file
|
||||
for one of the users. (e.g. "git grep fsl,.*-mdio arch/powerpc/boot/dts/")
|
||||
|
||||
Connecting to a PHY
|
||||
|
||||
Sometime during startup, the network driver needs to establish a connection
|
||||
between the PHY device, and the network device. At this time, the PHY's bus
|
||||
and drivers need to all have been loaded, so it is ready for the connection.
|
||||
At this point, there are several ways to connect to the PHY:
|
||||
|
||||
1) The PAL handles everything, and only calls the network driver when
|
||||
the link state changes, so it can react.
|
||||
|
||||
2) The PAL handles everything except interrupts (usually because the
|
||||
controller has the interrupt registers).
|
||||
|
||||
3) The PAL handles everything, but checks in with the driver every second,
|
||||
allowing the network driver to react first to any changes before the PAL
|
||||
does.
|
||||
|
||||
4) The PAL serves only as a library of functions, with the network device
|
||||
manually calling functions to update status, and configure the PHY.
|
||||
|
||||
|
||||
Letting the PHY Abstraction Layer do Everything
|
||||
|
||||
If you choose option 1 (the hope is that every driver can, while the PAL
|
||||
remains useful to drivers that can't), connecting to the PHY is simple:
|
||||
|
||||
First, you need a function to react to changes in the link state. This
|
||||
function has this prototype:
|
||||
|
||||
static void adjust_link(struct net_device *dev);
|
||||
|
||||
Next, you need to know the device name of the PHY connected to this device.
|
||||
The name will look something like "0:00", where the first number is the
|
||||
bus id, and the second is the PHY's address on that bus. Typically,
|
||||
the bus is responsible for making its ID unique.
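
As a rough illustration of the naming scheme described above, the following
standalone C sketch builds such a name from a bus id string and a PHY address.
The helper demo_phy_name() and the "%s:%02x" format are assumptions made for
this example only; a real driver should use whatever format macro its kernel
tree provides.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel API): format "<bus id>:<address>",
 * with the address printed as two hex digits, e.g. "0:00".
 * Returns 0 on success, -1 if the buffer is too small.
 */
int demo_phy_name(char *buf, size_t len, const char *bus_id,
		  unsigned int addr)
{
	int n;

	assert(addr < 32);	/* an MDIO bus has at most 32 addresses */
	n = snprintf(buf, len, "%s:%02x", bus_id, addr);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting string is what would be passed as phy_name in the phy_connect()
call shown next.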
|
||||
|
||||
Now, to connect, just call this function:
|
||||
|
||||
phydev = phy_connect(dev, phy_name, &adjust_link, interface);
|
||||
|
||||
phydev is a pointer to the phy_device structure which represents the PHY. If
|
||||
phy_connect is successful, it will return the pointer. dev, here, is the
|
||||
pointer to your net_device. Once done, this function will have started the
|
||||
PHY's software state machine, and registered for the PHY's interrupt, if it
|
||||
has one. The phydev structure will be populated with information about the
|
||||
current state, though the PHY will not yet be truly operational at this
|
||||
point.
|
||||
|
||||
PHY-specific flags should be set in phydev->dev_flags prior to the call
|
||||
to phy_connect() such that the underlying PHY driver can check for flags
|
||||
and perform specific operations based on them.
|
||||
This is useful if the system has put hardware restrictions on
|
||||
the PHY/controller, of which the PHY needs to be aware.
|
||||
|
||||
interface is a phy_interface_t value which specifies the connection type used
|
||||
between the controller and the PHY. Examples are GMII, MII,
|
||||
RGMII, and SGMII. For a full list, see include/linux/phy.h.
|
||||
|
||||
Now just make sure that phydev->supported and phydev->advertising have any
|
||||
values pruned from them which don't make sense for your controller (a 10/100
|
||||
controller may be connected to a gigabit capable PHY, so you would need to
|
||||
mask off SUPPORTED_1000baseT*). See include/linux/ethtool.h for definitions
|
||||
of these bitfields. Note that you should not SET any bits, or the PHY may
|
||||
get put into an unsupported state.
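
The pruning rule above (clear bits, never set them) can be sketched in plain C.
This is a userspace illustration, not driver code: the DEMO_* constants are
local stand-ins for the SUPPORTED_* bits, and demo_prune_features() is a name
invented here. Consult include/linux/ethtool.h for the real definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for a few SUPPORTED_* bits, renamed for this sketch. */
#define DEMO_10baseT_Half	(1u << 0)
#define DEMO_10baseT_Full	(1u << 1)
#define DEMO_100baseT_Half	(1u << 2)
#define DEMO_100baseT_Full	(1u << 3)
#define DEMO_1000baseT_Half	(1u << 4)
#define DEMO_1000baseT_Full	(1u << 5)
#define DEMO_Autoneg		(1u << 6)

/* What a hypothetical 10/100 controller can handle. */
#define DEMO_MAC_10_100 (DEMO_10baseT_Half | DEMO_10baseT_Full | \
			 DEMO_100baseT_Half | DEMO_100baseT_Full | \
			 DEMO_Autoneg)

/*
 * Prune a PHY feature mask down to what the controller can also do.
 * The AND can only clear bits; it never sets a bit the PHY did not report.
 */
uint32_t demo_prune_features(uint32_t phy_features, uint32_t mac_features)
{
	uint32_t pruned = phy_features & mac_features;

	assert((pruned & ~phy_features) == 0);
	return pruned;
}
```

In a driver the same mask would be applied to both phydev->supported and
phydev->advertising, as the paragraph above describes.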
|
||||
|
||||
Lastly, once the controller is ready to handle network traffic, you call
|
||||
phy_start(phydev). This tells the PAL that you are ready, and configures the
|
||||
PHY to connect to the network. If you want to handle your own interrupts,
|
||||
just set phydev->irq to PHY_IGNORE_INTERRUPT before you call phy_start.
|
||||
Similarly, if you don't want to use interrupts, set phydev->irq to PHY_POLL.
|
||||
|
||||
When you want to disconnect from the network (even if just briefly), you call
|
||||
phy_stop(phydev).
|
||||
|
||||
Keeping Close Tabs on the PAL
|
||||
|
||||
It is possible that the PAL's built-in state machine needs a little help to
|
||||
keep your network device and the PHY properly in sync. If so, you can
|
||||
register a helper function when connecting to the PHY, which will be called
|
||||
every second before the state machine reacts to any changes. To do this, you
|
||||
need to manually call phy_attach() and phy_prepare_link(), and then call
|
||||
phy_start_machine() with the second argument set to point to your special
|
||||
handler.
|
||||
|
||||
Currently there are no examples of how to use this functionality, and testing
|
||||
on it has been limited because the author does not have any drivers which use
|
||||
it (they all use option 1). So Caveat Emptor.
|
||||
|
||||
Doing it all yourself
|
||||
|
||||
There's a remote chance that the PAL's built-in state machine cannot track
|
||||
the complex interactions between the PHY and your network device. If this is
|
||||
so, you can simply call phy_attach(), and not call phy_start_machine or
|
||||
phy_prepare_link(). This will mean that phydev->state is entirely yours to
|
||||
handle (phy_start and phy_stop toggle between some of the states, so you
|
||||
might need to avoid them).
|
||||
|
||||
An effort has been made to make sure that useful functionality can be
|
||||
accessed without the state-machine running, and most of these functions are
|
||||
descended from functions which did not interact with a complex state-machine.
|
||||
However, again, no effort has been made so far to test running without the
|
||||
state machine, so tryer beware.
|
||||
|
||||
Here is a brief rundown of the functions:
|
||||
|
||||
int phy_read(struct phy_device *phydev, u16 regnum);
|
||||
int phy_write(struct phy_device *phydev, u16 regnum, u16 val);
|
||||
|
||||
Simple read/write primitives. They invoke the bus's read/write function
|
||||
pointers.
|
||||
|
||||
void phy_print_status(struct phy_device *phydev);
|
||||
|
||||
A convenience function to print out the PHY status neatly.
|
||||
|
||||
int phy_start_interrupts(struct phy_device *phydev);
|
||||
int phy_stop_interrupts(struct phy_device *phydev);
|
||||
|
||||
Requests the IRQ for the PHY interrupts, then enables them for
|
||||
start, or disables then frees them for stop.
|
||||
|
||||
struct phy_device * phy_attach(struct net_device *dev, const char *phy_id,
|
||||
phy_interface_t interface);
|
||||
|
||||
Attaches a network device to a particular PHY, binding the PHY to a generic
|
||||
driver if none was found during bus initialization.
|
||||
|
||||
int phy_start_aneg(struct phy_device *phydev);
|
||||
|
||||
Using variables inside the phydev structure, either configures advertising
|
||||
and resets autonegotiation, or disables autonegotiation, and configures
|
||||
forced settings.
|
||||
|
||||
static inline int phy_read_status(struct phy_device *phydev);
|
||||
|
||||
Fills the phydev structure with up-to-date information about the current
|
||||
settings in the PHY.
|
||||
|
||||
int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd);
|
||||
int phy_ethtool_gset(struct phy_device *phydev, struct ethtool_cmd *cmd);
|
||||
|
||||
Ethtool convenience functions.
|
||||
|
||||
int phy_mii_ioctl(struct phy_device *phydev,
|
||||
struct mii_ioctl_data *mii_data, int cmd);
|
||||
|
||||
The MII ioctl. Note that this function will completely screw up the state
|
||||
machine if you write registers like BMCR, BMSR, ADVERTISE, etc. Best to
|
||||
use this only to write registers which are not standard, and don't set off
|
||||
a renegotiation.
|
||||
|
||||
|
||||
PHY Device Drivers
|
||||
|
||||
With the PHY Abstraction Layer, adding support for new PHYs is
|
||||
quite easy. In some cases, no work is required at all! However,
|
||||
many PHYs require a little hand-holding to get up-and-running.
|
||||
|
||||
Generic PHY driver
|
||||
|
||||
If the desired PHY doesn't have any errata, quirks, or special
|
||||
features you want to support, then it may be best to not add
|
||||
support, and let the PHY Abstraction Layer's Generic PHY Driver
|
||||
do all of the work.
|
||||
|
||||
Writing a PHY driver
|
||||
|
||||
If you do need to write a PHY driver, the first thing to do is
|
||||
make sure it can be matched with an appropriate PHY device.
|
||||
This is done during bus initialization by reading the device's
|
||||
UID (stored in registers 2 and 3), then comparing it to each
|
||||
driver's phy_id field by ANDing it with each driver's
|
||||
phy_id_mask field. Also, it needs a name. Here's an example:
|
||||
|
||||
	static struct phy_driver dm9161_driver = {
		.phy_id         = 0x0181b880,
		.name           = "Davicom DM9161E",
		.phy_id_mask    = 0x0ffffff0,
		...
	};
|
||||
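
The matching rule described above can be written out as a small standalone C
sketch. It is a model of the comparison only, under the assumption that
register 2 holds the upper 16 bits of the UID and register 3 the lower 16;
demo_phy_uid() and demo_phy_id_matches() are names made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the two 16-bit ID registers (2 and 3) into one 32-bit UID. */
uint32_t demo_phy_uid(uint16_t reg2, uint16_t reg3)
{
	return ((uint32_t)reg2 << 16) | reg3;
}

/* A driver matches when the masked UID equals the masked driver ID. */
int demo_phy_id_matches(uint32_t uid, uint32_t drv_id, uint32_t drv_mask)
{
	assert(drv_mask != 0);	/* a zero mask would match every device */
	return (uid & drv_mask) == (drv_id & drv_mask);
}
```

With the DM9161E values shown above, the mask 0x0ffffff0 ignores the top and
bottom nibbles, so devices that differ only in those bits (for example in a
revision field) still bind to the same driver.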
|
||||
Next, you need to specify what features (speed, duplex, autoneg,
|
||||
etc) your PHY device and driver support. Most PHYs support
|
||||
PHY_BASIC_FEATURES, but you can look in include/linux/mii.h for other
|
||||
features.
|
||||
|
||||
Each driver consists of a number of function pointers:
|
||||
|
||||
   soft_reset: perform a PHY software reset
   config_init: configures PHY into a sane state after a reset.
     For instance, a Davicom PHY requires descrambling disabled.
   probe: Allocate phy->priv, optionally refuse to bind.
     PHY may not have been reset or had fixups run yet.
   suspend/resume: power management
   config_aneg: Changes the speed/duplex/negotiation settings
   aneg_done: Determines the auto-negotiation result
   read_status: Reads the current speed/duplex/negotiation settings
   ack_interrupt: Clear a pending interrupt
   did_interrupt: Checks if the PHY generated an interrupt
   config_intr: Enable or disable interrupts
   remove: Does any driver take-down
   ts_info: Queries about the HW timestamping status
   hwtstamp: Set the PHY HW timestamping configuration
   rxtstamp: Requests a receive timestamp at the PHY level for a 'skb'
   txtstamp: Requests a transmit timestamp at the PHY level for a 'skb'
   set_wol: Enable Wake-on-LAN at the PHY level
   get_wol: Get the Wake-on-LAN status at the PHY level
   read_mmd_indirect: Read PHY MMD indirect register
   write_mmd_indirect: Write PHY MMD indirect register
|
||||
|
||||
Of these, only config_aneg and read_status are required to be
|
||||
assigned by the driver code. The rest are optional. Also, it is
|
||||
preferred to use the generic phy driver's versions of these two
|
||||
functions if at all possible: genphy_read_status and
|
||||
genphy_config_aneg. If this is not possible, it is likely that
|
||||
you only need to perform some actions before and after invoking
|
||||
these functions, and so your functions will wrap the generic
|
||||
ones.
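
The wrapping pattern mentioned above might look roughly like the following.
This is a userspace model, not kernel code: struct demo_phy and both functions
are hypothetical stand-ins, with demo_generic_config_aneg() playing the role
that genphy_config_aneg would play in a real driver.

```c
#include <assert.h>

struct demo_phy {
	int quirk_applied;	/* set by the driver-specific step */
	int aneg_configured;	/* set by the generic step */
};

/* Stand-in for a generic config_aneg implementation. */
int demo_generic_config_aneg(struct demo_phy *phy)
{
	phy->aneg_configured = 1;
	return 0;
}

/* Driver-specific version: do the extra work, then defer to the generic one. */
int demo_quirky_config_aneg(struct demo_phy *phy)
{
	assert(phy != 0);
	phy->quirk_applied = 1;	/* e.g. a vendor-specific register write */
	return demo_generic_config_aneg(phy);
}
```

Keeping the generic call as the last step means the wrapper inherits its return
value and any later improvements to the generic code.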
|
||||
|
||||
Feel free to look at the Marvell, Cicada, and Davicom drivers in
|
||||
drivers/net/phy/ for examples (the lxt and qsemi drivers have
|
||||
not been tested as of this writing).
|
||||
|
||||
The PHY's MMD register accesses are handled by the PAL framework
|
||||
by default, but can be overridden by a specific PHY driver if
|
||||
required. This could be the case if a PHY was released for
|
||||
manufacturing before the MMD PHY register definitions were
|
||||
standardized by the IEEE. Most modern PHYs will be able to use
|
||||
the generic PAL framework for accessing the PHY's MMD registers.
|
||||
An example of such usage is for Energy Efficient Ethernet support,
|
||||
implemented in the PAL. This support uses the PAL to access MMD
|
||||
registers for EEE query and configuration if the PHY supports
|
||||
the IEEE standard access mechanisms, or can use the PHY's specific
|
||||
access interfaces if overridden by the specific PHY driver. See
|
||||
the Micrel driver in drivers/net/phy/ for an example of how this
|
||||
can be implemented.
|
||||
|
||||
Board Fixups
|
||||
|
||||
Sometimes the specific interaction between the platform and the PHY requires
|
||||
special handling. For instance, to change where the PHY's clock input is,
|
||||
or to add a delay to account for latency issues in the data path. In order
|
||||
to support such contingencies, the PHY Layer allows platform code to register
|
||||
fixups to be run when the PHY is brought up (or subsequently reset).
|
||||
|
||||
When the PHY Layer brings up a PHY it checks to see if there are any fixups
|
||||
registered for it, matching based on UID (contained in the PHY device's phy_id
|
||||
field) and the bus identifier (contained in phydev->dev.bus_id). Both must
|
||||
match; however, two constants, PHY_ANY_ID and PHY_ANY_UID, are provided as
|
||||
wildcards for the bus ID and UID, respectively.
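
The two-part match with wildcards can be modelled in a few lines of standalone
C. Everything here is illustrative: struct demo_fixup, demo_fixup_applies() and
the DEMO_ANY_* values are assumptions for this sketch and are not the kernel's
own definitions of PHY_ANY_ID and PHY_ANY_UID.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the wildcard constants; the values are made up here. */
#define DEMO_ANY_ID	"*"
#define DEMO_ANY_UID	0xffffffffu

struct demo_fixup {
	const char *bus_id;	/* bus identifier to match, or DEMO_ANY_ID */
	uint32_t uid;		/* UID to match, or DEMO_ANY_UID */
	uint32_t uid_mask;	/* bits of the UID that are compared */
};

/* Return 1 if the fixup applies to the PHY; both criteria must match. */
int demo_fixup_applies(const struct demo_fixup *f, const char *phy_bus_id,
		       uint32_t phy_uid)
{
	assert(f && f->bus_id && phy_bus_id);

	if (strcmp(f->bus_id, DEMO_ANY_ID) != 0 &&
	    strcmp(f->bus_id, phy_bus_id) != 0)
		return 0;
	if (f->uid != DEMO_ANY_UID &&
	    (f->uid & f->uid_mask) != (phy_uid & f->uid_mask))
		return 0;
	return 1;
}
```

A fixup that sets only one criterion and leaves the other as a wildcard
corresponds to the for_uid and for_id registration variants.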
|
||||
|
||||
When a match is found, the PHY layer will invoke the run function associated
|
||||
with the fixup. This function is passed a pointer to the phy_device of
|
||||
interest. It should therefore only operate on that PHY.
|
||||
|
||||
The platform code can either register the fixup using phy_register_fixup():
|
||||
|
||||
int phy_register_fixup(const char *phy_id,
|
||||
u32 phy_uid, u32 phy_uid_mask,
|
||||
int (*run)(struct phy_device *));
|
||||
|
||||
Or using one of the two stubs, phy_register_fixup_for_uid() and
|
||||
phy_register_fixup_for_id():
|
||||
|
||||
int phy_register_fixup_for_uid(u32 phy_uid, u32 phy_uid_mask,
|
||||
int (*run)(struct phy_device *));
|
||||
int phy_register_fixup_for_id(const char *phy_id,
|
||||
int (*run)(struct phy_device *));
|
||||
|
||||
The stubs set one of the two matching criteria, and set the other one to
|
||||
match anything.
|
||||
|
||||
321
Documentation/networking/pktgen.txt
Normal file
|
|
@ -0,0 +1,321 @@
|
|||
|
||||
|
||||
HOWTO for the linux packet generator
|
||||
------------------------------------
|
||||
|
||||
Date: 041221
|
||||
|
||||
Enable CONFIG_NET_PKTGEN to compile and build pktgen.o either in kernel
|
||||
or as module. Module is preferred. insmod pktgen if needed. Once running
|
||||
pktgen creates a thread on each CPU where each thread has affinity to its CPU.
|
||||
Monitoring and controlling is done via /proc. It is easiest to select a suitable
sample script and configure it.
|
||||
|
||||
On a dual CPU:
|
||||
|
||||
ps aux | grep pkt
|
||||
root 129 0.3 0.0 0 0 ? SW 2003 523:20 [pktgen/0]
|
||||
root 130 0.3 0.0 0 0 ? SW 2003 509:50 [pktgen/1]
|
||||
|
||||
|
||||
For monitoring and control pktgen creates:
|
||||
/proc/net/pktgen/pgctrl
|
||||
/proc/net/pktgen/kpktgend_X
|
||||
/proc/net/pktgen/ethX
|
||||
|
||||
|
||||
Tuning NIC for max performance
|
||||
==============================
|
||||
|
||||
The default NIC settings are (likely) not tuned for pktgen's artificial
|
||||
overload type of benchmarking, as this could hurt the normal use-case.
|
||||
|
||||
Specifically, consider increasing the TX ring buffer in the NIC:
|
||||
# ethtool -G ethX tx 1024
|
||||
|
||||
A larger TX ring can improve pktgen's performance, while it can hurt
|
||||
in the general case, 1) because the TX ring buffer might get larger
|
||||
than the CPU's L1/L2 cache, 2) because it allows more queueing in the
|
||||
NIC HW layer (which is bad for bufferbloat).
|
||||
|
||||
One should be careful before concluding that packets/descriptors in the HW
|
||||
TX ring cause delay. Drivers usually delay cleaning up the
|
||||
ring-buffers (for various performance reasons), thus packets stalling
|
||||
the TX ring might just be waiting for cleanup.
|
||||
|
||||
This cleanup issue is specifically the case for the driver ixgbe
|
||||
(Intel 82599 chip). This driver (ixgbe) combines TX+RX ring cleanups,
|
||||
and the cleanup interval is affected by the ethtool --coalesce setting
|
||||
of parameter "rx-usecs".
|
||||
|
||||
For ixgbe use e.g. "30", resulting in approx. 33K interrupts/sec (1/30*10^6):
|
||||
# ethtool -C ethX rx-usecs 30
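
The figure quoted above follows from simple arithmetic, sketched here in
standalone C. The helper name demo_irqs_per_sec() is invented for this example,
and the result is an upper bound under the assumption that one interrupt fires
per rx-usecs interval.

```c
#include <assert.h>

/* Upper bound on interrupts per second for a given rx-usecs interval. */
unsigned long demo_irqs_per_sec(unsigned long rx_usecs)
{
	assert(rx_usecs > 0);
	return 1000000UL / rx_usecs;
}
```

For rx-usecs 30 this gives 33333, which is the "approx. 33K" mentioned above.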
|
||||
|
||||
|
||||
Viewing threads
|
||||
===============
|
||||
/proc/net/pktgen/kpktgend_0
|
||||
Name: kpktgend_0 max_before_softirq: 10000
|
||||
Running:
|
||||
Stopped: eth1
|
||||
Result: OK: max_before_softirq=10000
|
||||
|
||||
Most important are the devices assigned to the thread. Note! A device can only belong
|
||||
to one thread.
|
||||
|
||||
|
||||
Viewing devices
|
||||
===============
|
||||
|
||||
The Params section holds configured info. The Current section holds running stats.
|
||||
Result is printed after run or after interruption. Example:
|
||||
|
||||
/proc/net/pktgen/eth1
|
||||
|
||||
Params: count 10000000 min_pkt_size: 60 max_pkt_size: 60
|
||||
frags: 0 delay: 0 clone_skb: 1000000 ifname: eth1
|
||||
flows: 0 flowlen: 0
|
||||
dst_min: 10.10.11.2 dst_max:
|
||||
src_min: src_max:
|
||||
src_mac: 00:00:00:00:00:00 dst_mac: 00:04:23:AC:FD:82
|
||||
udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9 udp_dst_max: 9
|
||||
src_mac_count: 0 dst_mac_count: 0
|
||||
Flags:
|
||||
Current:
|
||||
pkts-sofar: 10000000 errors: 39664
|
||||
started: 1103053986245187us stopped: 1103053999346329us idle: 880401us
|
||||
seq_num: 10000011 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
|
||||
cur_saddr: 0x10a0a0a cur_daddr: 0x20b0a0a
|
||||
cur_udp_dst: 9 cur_udp_src: 9
|
||||
flows: 0
|
||||
Result: OK: 13101142(c12220741+d880401) usec, 10000000 (60byte,0frags)
|
||||
763292pps 390Mb/sec (390805504bps) errors: 39664
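
The numbers in the Result line can be reproduced from the Params and Current
sections. The following standalone C sketch shows the arithmetic; the function
names are invented here, and the extra 4 bytes per packet are an assumption
that makes the sample figures add up (it would correspond to the Ethernet
frame check sequence), not something the sample output states.

```c
#include <assert.h>
#include <stdint.h>

/* Packets per second, truncated, from a packet count and elapsed usecs. */
uint64_t demo_pps(uint64_t packets, uint64_t elapsed_us)
{
	assert(elapsed_us > 0);
	return packets * 1000000ULL / elapsed_us;
}

/*
 * Bits per second for a given packet rate. The "+ 4" is the assumed
 * per-packet overhead described in the text above.
 */
uint64_t demo_bps(uint64_t pps, uint32_t pkt_size)
{
	return pps * 8 * ((uint64_t)pkt_size + 4);
}
```

Here the elapsed time is stopped minus started (13101142 usec), which also
equals the two components printed in parentheses, 12220741 + 880401.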
|
||||
|
||||
Configuring threads and devices
|
||||
================================
|
||||
This is done via the /proc interface, most easily through pgset in the scripts.
|
||||
|
||||
Examples:
|
||||
|
||||
pgset "clone_skb 1" sets the number of copies of the same packet
|
||||
pgset "clone_skb 0" use single SKB for all transmits
|
||||
pgset "burst 8" uses xmit_more API to queue 8 copies of the same
|
||||
packet and update HW tx queue tail pointer once.
|
||||
"burst 1" is the default
|
||||
pgset "pkt_size 9014" sets packet size to 9014
|
||||
pgset "frags 5" packet will consist of 5 fragments
|
||||
pgset "count 200000" sets number of packets to send, set to zero
|
||||
for continuous sends until explicitly stopped.
|
||||
|
||||
pgset "delay 5000" adds delay to hard_start_xmit(), in nanoseconds
|
||||
|
||||
pgset "dst 10.0.0.1" sets IP destination address
|
||||
(BEWARE! This generator is very aggressive!)
|
||||
|
||||
pgset "dst_min 10.0.0.1" Same as dst
|
||||
pgset "dst_max 10.0.0.254" Set the maximum destination IP.
|
||||
pgset "src_min 10.0.0.1" Set the minimum (or only) source IP.
|
||||
pgset "src_max 10.0.0.254" Set the maximum source IP.
|
||||
pgset "dst6 fec0::1" IPV6 destination address
|
||||
pgset "src6 fec0::2" IPV6 source address
|
||||
pgset "dstmac 00:00:00:00:00:00" sets MAC destination address
|
||||
pgset "srcmac 00:00:00:00:00:00" sets MAC source address
|
||||
|
||||
pgset "queue_map_min 0" Sets the min value of tx queue interval
|
||||
pgset "queue_map_max 7" Sets the max value of tx queue interval, for multiqueue devices
|
||||
To select queue 1 of a given device,
|
||||
use queue_map_min=1 and queue_map_max=1
|
||||
|
||||
pgset "src_mac_count 1" Sets the number of MACs we'll range through.
|
||||
The 'minimum' MAC is what you set with srcmac.
|
||||
|
||||
pgset "dst_mac_count 1" Sets the number of MACs we'll range through.
|
||||
The 'minimum' MAC is what you set with dstmac.
|
||||
|
||||
pgset "flag [name]" Set a flag to determine behaviour. Current flags
|
||||
are: IPSRC_RND # IP source is random (between min/max)
|
||||
IPDST_RND # IP destination is random
|
||||
UDPSRC_RND, UDPDST_RND,
|
||||
MACSRC_RND, MACDST_RND
|
||||
TXSIZE_RND, IPV6,
|
||||
MPLS_RND, VID_RND, SVID_RND
|
||||
FLOW_SEQ,
|
||||
QUEUE_MAP_RND # queue map random
|
||||
QUEUE_MAP_CPU # queue map mirrors smp_processor_id()
|
||||
UDPCSUM,
|
||||
IPSEC # IPsec encapsulation (needs CONFIG_XFRM)
|
||||
NODE_ALLOC # node specific memory allocation
|
||||
|
||||
pgset spi SPI_VALUE Set specific SA used to transform packet.
|
||||
|
||||
pgset "udp_src_min 9" set UDP source port min. If < udp_src_max, then
|
||||
cycle through the port range.
|
||||
|
||||
pgset "udp_src_max 9" set UDP source port max.
|
||||
pgset "udp_dst_min 9" set UDP destination port min. If < udp_dst_max, then
|
||||
cycle through the port range.
|
||||
pgset "udp_dst_max 9" set UDP destination port max.
|
||||
|
||||
pgset "mpls 0001000a,0002000a,0000000a" set MPLS labels (in this example
|
||||
outer label=16,middle label=32,
|
||||
inner label=0 (IPv4 NULL)) Note that
|
||||
there must be no spaces between the
|
||||
arguments. Leading zeros are required.
|
||||
Do not set the bottom of stack bit,
|
||||
that's done automatically. If you do
|
||||
set the bottom of stack bit, that
|
||||
indicates that you want to randomly
|
||||
generate that address and the flag
|
||||
MPLS_RND will be turned on. You
|
||||
can have any mix of random and fixed
|
||||
labels in the label stack.
|
||||
|
||||
pgset "mpls 0" turn off mpls (or any invalid argument works too!)
|
||||
|
||||
pgset "vlan_id 77" set VLAN ID 0-4095
|
||||
pgset "vlan_p 3" set priority bit 0-7 (default 0)
|
||||
pgset "vlan_cfi 0" set canonical format identifier 0-1 (default 0)
|
||||
|
||||
pgset "svlan_id 22" set SVLAN ID 0-4095
|
||||
pgset "svlan_p 3" set priority bit 0-7 (default 0)
|
||||
pgset "svlan_cfi 0" set canonical format identifier 0-1 (default 0)
|
||||
|
||||
pgset "vlan_id 9999" > 4095 remove vlan and svlan tags
|
||||
pgset "svlan_id 9999" > 4095 remove svlan tag
|
||||
|
||||
|
||||
pgset "tos XX" set former IPv4 TOS field (e.g. "tos 28" for AF11 no ECN, default 00)
|
||||
pgset "traffic_class XX" set former IPv6 TRAFFIC CLASS (e.g. "traffic_class B8" for EF no ECN, default 00)
|
||||
|
||||
pgset stop aborts injection. Also, ^C aborts generator.
|
||||
|
||||
pgset "rate 300M" set rate to 300 Mb/s
|
||||
pgset "ratep 1000000" set rate to 1Mpps
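
The hexadecimal arguments in the list above (the mpls labels and the tos and
traffic_class values) are easier to check with a small decoder. The sketch
below is a standalone C illustration, not pktgen code: demo_mpls_decode() and
demo_ds_octet() are names invented here, and the layouts assumed are the
standard 32-bit MPLS label stack entry (20-bit label, 3-bit traffic class,
bottom-of-stack bit, 8-bit TTL) and the usual DSCP/ECN split of the TOS octet.

```c
#include <assert.h>
#include <stdint.h>

struct demo_mpls {
	uint32_t label;	/* 20-bit label value */
	uint8_t tc;	/* 3-bit traffic class field */
	uint8_t bos;	/* bottom-of-stack bit */
	uint8_t ttl;	/* 8-bit time to live */
};

/* Split one 32-bit MPLS label stack entry into its fields. */
struct demo_mpls demo_mpls_decode(uint32_t entry)
{
	struct demo_mpls m;

	m.label = entry >> 12;
	m.tc = (entry >> 9) & 0x7;
	m.bos = (entry >> 8) & 0x1;
	m.ttl = entry & 0xff;
	return m;
}

/* Build a TOS / traffic class octet: DSCP in the top 6 bits, ECN below. */
uint8_t demo_ds_octet(uint8_t dscp, uint8_t ecn)
{
	assert(dscp < 64 && ecn < 4);
	return (uint8_t)((dscp << 2) | ecn);
}
```

With these helpers, 0001000a decodes to label 16 with TTL 10, and DSCP 10
(AF11) or DSCP 46 (EF) with ECN 0 gives 28 or B8 in hex, which matches the
examples in the list.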
|
||||
|
||||
Example scripts
|
||||
===============
|
||||
|
||||
A collection of small tutorial scripts for pktgen is in the examples dir.
|
||||
|
||||
pktgen.conf-1-1 # 1 CPU 1 dev
|
||||
pktgen.conf-1-2 # 1 CPU 2 dev
|
||||
pktgen.conf-2-1 # 2 CPUs 1 dev
pktgen.conf-2-2 # 2 CPUs 2 dev
|
||||
pktgen.conf-1-1-rdos # 1 CPU 1 dev w. route DoS
|
||||
pktgen.conf-1-1-ip6 # 1 CPU 1 dev ipv6
|
||||
pktgen.conf-1-1-ip6-rdos # 1 CPU 1 dev ipv6 w. route DoS
|
||||
pktgen.conf-1-1-flows # 1 CPU 1 dev multiple flows.
|
||||
|
||||
Run in shell: ./pktgen.conf-X-Y It does all the setup including sending.
|
||||
|
||||
|
||||
Interrupt affinity
|
||||
===================
|
||||
Note that when adding devices to a specific CPU it is a good idea to also assign
/proc/irq/XX/smp_affinity so that the TX interrupts get bound to the same CPU,
as this reduces cache bouncing when freeing skbs.
|
||||
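
The value written to smp_affinity is a hexadecimal CPU bitmask. As a hedged
illustration, the standalone C helper below formats the mask that selects a
single CPU; demo_affinity_mask() is a name invented for this sketch, and it
only covers the first 32 CPUs.

```c
#include <assert.h>
#include <stdio.h>

/* Format the hex CPU bitmask that selects a single CPU (CPU 0 gives "1"). */
int demo_affinity_mask(char *buf, size_t len, unsigned int cpu)
{
	int n;

	assert(cpu < 32);	/* this sketch handles one 32-bit word only */
	n = snprintf(buf, len, "%x", 1u << cpu);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting string is what one would write to /proc/irq/XX/smp_affinity,
where XX is the IRQ number of the device's TX interrupt.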
|
||||
Enable IPsec
|
||||
============
|
||||
Default IPsec transformation with ESP encapsulation plus Transport mode
|
||||
can be enabled by simply setting:
|
||||
|
||||
pgset "flag IPSEC"
|
||||
pgset "flows 1"
|
||||
|
||||
To avoid breaking existing testbed scripts for using AH type and tunnel mode,
|
||||
users can use "pgset spi SPI_VALUE" to specify which form of transformation
|
||||
to employ.
|
||||
|
||||
|
||||
Current commands and configuration options
|
||||
==========================================
|
||||
|
||||
** Pgcontrol commands:
|
||||
|
||||
start
|
||||
stop
|
||||
|
||||
** Thread commands:
|
||||
|
||||
add_device
|
||||
rem_device_all
|
||||
max_before_softirq
|
||||
|
||||
|
||||
** Device commands:
|
||||
|
||||
count
|
||||
clone_skb
|
||||
debug
|
||||
|
||||
frags
|
||||
delay
|
||||
|
||||
src_mac_count
|
||||
dst_mac_count
|
||||
|
||||
pkt_size
|
||||
min_pkt_size
|
||||
max_pkt_size
|
||||
|
||||
mpls
|
||||
|
||||
udp_src_min
|
||||
udp_src_max
|
||||
|
||||
udp_dst_min
|
||||
udp_dst_max
|
||||
|
||||
flag
|
||||
IPSRC_RND
|
||||
IPDST_RND
|
||||
UDPSRC_RND
|
||||
UDPDST_RND
|
||||
MACSRC_RND
|
||||
MACDST_RND
|
||||
TXSIZE_RND
|
||||
IPV6
|
||||
MPLS_RND
|
||||
VID_RND
|
||||
SVID_RND
|
||||
FLOW_SEQ
|
||||
QUEUE_MAP_RND
|
||||
QUEUE_MAP_CPU
|
||||
UDPCSUM
|
||||
IPSEC
|
||||
NODE_ALLOC
|
||||
|
||||
dst_min
|
||||
dst_max
|
||||
|
||||
src_min
|
||||
src_max
|
||||
|
||||
dst_mac
|
||||
src_mac
|
||||
|
||||
clear_counters
|
||||
|
||||
dst6
|
||||
src6
|
||||
|
||||
flows
|
||||
flowlen
|
||||
|
||||
rate
|
||||
ratep
|
||||
|
||||
References:
|
||||
ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/
|
||||
ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/
|
||||
|
||||
Paper from Linux-Kongress in Erlangen 2004.
|
||||
ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/pktgen_paper.pdf
|
||||
|
||||
Thanks to:
|
||||
Grant Grundler for testing on IA-64 and parisc, Harald Welte, Lennert Buytenhek
|
||||
Stephen Hemminger, Andi Kleen, Dave Miller and many others.
|
||||
|
||||
|
||||
Good luck with the linux net-development.
|
||||
150
Documentation/networking/policy-routing.txt
Normal file
|
|
@ -0,0 +1,150 @@
|
|||
Classes
|
||||
-------
|
||||
|
||||
"Class" is a complete routing table in common sense.
|
||||
I.e. it is a tree of nodes (destination prefix, tos, metric)
|
||||
with attached information: gateway, device etc.
|
||||
This tree is looked up as specified in RFC1812 5.2.4.3
|
||||
1. Basic match
|
||||
2. Longest match
|
||||
3. Weak TOS.
|
||||
4. Metric. (should not be in kernel space, but they are)
|
||||
5. Additional pruning rules. (not in kernel space).
|
||||
|
||||
We have two special types of nodes:
|
||||
REJECT - abort route lookup and return an error value.
|
||||
THROW - abort route lookup in this class.
|
||||
|
||||
|
||||
Currently the number of classes is limited to 255
|
||||
(0 is reserved for "not specified class")
|
||||
|
||||
Three classes are builtin:
|
||||
|
||||
RT_CLASS_LOCAL=255 - local interface addresses,
|
||||
broadcasts, nat addresses.
|
||||
|
||||
RT_CLASS_MAIN=254 - all normal routes are put there
|
||||
by default.
|
||||
|
||||
RT_CLASS_DEFAULT=253 - if ip_fib_model==1, then
|
||||
normal default routes are put there, if ip_fib_model==2
|
||||
all gateway routes are put there.
|
||||
|
||||
|
||||
Rules
|
||||
-----
|
||||
Rule is a record of (src prefix, src interface, tos, dst prefix)
|
||||
with attached information.
|
||||
|
||||
Rule types:
|
||||
RTP_ROUTE - lookup in attached class
|
||||
RTP_NAT - lookup in attached class and if a match is found,
|
||||
translate packet source address.
|
||||
RTP_MASQUERADE - lookup in attached class and if a match is found,
|
||||
masquerade packet as sourced by us.
|
||||
RTP_DROP - silently drop the packet.
|
||||
RTP_REJECT - drop the packet and send ICMP NET UNREACHABLE.
|
||||
RTP_PROHIBIT - drop the packet and send ICMP COMM. ADM. PROHIBITED.
|
||||
|
||||
Rule flags:
|
||||
RTRF_LOG - log route creations.
|
||||
RTRF_VALVE - One way route (used with masquerading)
|
||||
|
||||
Default setup:
|
||||
|
||||
root@amber:/pub/ip-routing # iproute -r
|
||||
Kernel routing policy rules
|
||||
Pref Source Destination TOS Iface Cl
|
||||
0 default default 00 * 255
|
||||
254 default default 00 * 254
|
||||
255 default default 00 * 253
|
||||
|
||||
|
||||
Lookup algorithm
|
||||
----------------
|
||||
|
||||
We scan rules list, and if a rule is matched, apply it.
|
||||
If a route is found, return it.
|
||||
If it is not found or a THROW node was matched, continue
|
||||
to scan rules.
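
The scan described above can be modelled in standalone C. This is a simplified
sketch, not the kernel's implementation: rules match on destination prefix
only, each class is represented by a lookup callback, and all names (demo_rule,
demo_route_lookup(), the toy classes) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum demo_result { DEMO_FOUND, DEMO_NOT_FOUND, DEMO_THROW, DEMO_REJECT };

/* Hypothetical per-class lookup: returns a result and, if found, a gateway. */
typedef enum demo_result (*demo_class_lookup)(unsigned int dst, int *gw);

struct demo_rule {
	unsigned int dst_prefix;	/* destination prefix the rule covers */
	unsigned int dst_mask;		/* netmask for dst_prefix */
	demo_class_lookup lookup;	/* class attached to the rule */
};

/* Toy classes used to exercise the scan. */
enum demo_result demo_class_throw(unsigned int dst, int *gw)
{
	(void)dst; (void)gw;
	return DEMO_THROW;
}

enum demo_result demo_class_reject(unsigned int dst, int *gw)
{
	(void)dst; (void)gw;
	return DEMO_REJECT;
}

enum demo_result demo_class_main(unsigned int dst, int *gw)
{
	(void)dst;
	*gw = 254;	/* arbitrary gateway id for the example */
	return DEMO_FOUND;
}

/*
 * Scan the rules in order. A matching rule's class is consulted: FOUND ends
 * the search, REJECT aborts with an error, and NOT_FOUND or THROW moves on
 * to the next rule. Returns 0 and sets *gw on success, -1 otherwise.
 */
int demo_route_lookup(const struct demo_rule *rules, size_t n,
		      unsigned int dst, int *gw)
{
	size_t i;

	assert(rules != NULL && gw != NULL);
	for (i = 0; i < n; i++) {
		if ((dst & rules[i].dst_mask) != rules[i].dst_prefix)
			continue;
		switch (rules[i].lookup(dst, gw)) {
		case DEMO_FOUND:
			return 0;
		case DEMO_REJECT:
			return -1;
		case DEMO_NOT_FOUND:
		case DEMO_THROW:
			break;	/* keep scanning the remaining rules */
		}
	}
	return -1;
}
```

The difference between the two special node types shows up in the switch:
THROW only ends the lookup in the current class, while REJECT ends the whole
scan.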
|
||||
|
||||
Applications
|
||||
------------
|
||||
|
||||
1. Just ignore classes. All the routes are put into MAIN class
|
||||
(and/or into DEFAULT class).
|
||||
|
||||
HOWTO: iproute add PREFIX [ tos TOS ] [ gw GW ] [ dev DEV ]
|
||||
[ metric METRIC ] [ reject ] ... (look at iproute utility)
|
||||
|
||||
or use route utility from current net-tools.
|
||||
|
||||
2. Opposite case. Just forget all that you know about routing
|
||||
tables. Every rule is supplied with its own gateway, device
|
||||
info. record. This approach is not appropriate for automated
|
||||
route maintenance, but it is ideal for manual configuration.
|
||||
|
||||
HOWTO: iproute addrule [ from PREFIX ] [ to PREFIX ] [ tos TOS ]
|
||||
[ dev INPUTDEV] [ pref PREFERENCE ] route [ gw GATEWAY ]
|
||||
[ dev OUTDEV ] .....
|
||||
|
||||
Warning: As of now the size of the routing table in this
|
||||
approach is limited to 256. If someone likes this model, I'll
|
||||
relax this limitation.
|
||||
|
||||
3. OSPF classes (see RFC1583, RFC1812 E.3.3)
|
||||
Very clean, stable and robust algorithm for OSPF routing
|
||||
domains. Unfortunately, it is not widely used in the Internet.
|
||||
|
||||
Proposed setup:
|
||||
255 local addresses
|
||||
254 interface routes
|
||||
253 ASE routes with external metric
|
||||
252 ASE routes with internal metric
|
||||
251 inter-area routes
|
||||
250 intra-area routes for 1st area
|
||||
249 intra-area routes for 2nd area
|
||||
etc.
|
||||
|
||||
Rules:
|
||||
iproute addrule class 253
|
||||
iproute addrule class 252
|
||||
iproute addrule class 251
|
||||
iproute addrule to a-prefix-for-1st-area class 250
|
||||
iproute addrule to another-prefix-for-1st-area class 250
|
||||
...
|
||||
iproute addrule to a-prefix-for-2nd-area class 249
|
||||
...
|
||||
|
||||
Area classes must be terminated with reject record.
|
||||
iproute add default reject class 250
|
||||
iproute add default reject class 249
|
||||
...
|
||||
|
||||
4. The Variant Router Requirements Algorithm (RFC1812 E.3.2)
|
||||
Create 16 classes for different TOS values.
|
||||
It is a funny, but pretty useless algorithm.
|
||||
I listed it just to show the power of the new routing code.
|
||||
|
||||
5. All the variety of combinations......
|
||||
|
||||
|
||||
GATED
|
||||
-----
|
||||
|
||||
Gated does not understand classes, but it will work
|
||||
happily in MAIN+DEFAULT. All policy routes can be set
|
||||
and maintained manually.
|
||||
|
||||
IMPORTANT NOTE
|
||||
--------------
|
||||
route.c has a compilation time switch CONFIG_IP_LOCAL_RT_POLICY.
|
||||
If it is set, locally originated packets are routed
|
||||
using the whole policy list. This is not very convenient and
|
||||
pretty ambiguous when used with NAT and masquerading.
|
||||
I set it to FALSE by default.
|
||||
|
||||
|
||||
Alexey Kuznetov
|
||||
kuznet@ms2.inr.ac.ru
|
||||
432
Documentation/networking/ppp_generic.txt
Normal file
|
|
@ -0,0 +1,432 @@
|
|||
PPP Generic Driver and Channel Interface
|
||||
----------------------------------------
|
||||
|
||||
Paul Mackerras
|
||||
paulus@samba.org
|
||||
7 Feb 2002
|
||||
|
||||
The generic PPP driver in linux-2.4 provides an implementation of the
|
||||
functionality which is of use in any PPP implementation, including:
|
||||
|
||||
* the network interface unit (ppp0 etc.)
|
||||
* the interface to the networking code
|
||||
* PPP multilink: splitting datagrams between multiple links, and
|
||||
ordering and combining received fragments
|
||||
* the interface to pppd, via a /dev/ppp character device
|
||||
* packet compression and decompression
|
||||
* TCP/IP header compression and decompression
|
||||
* detecting network traffic for demand dialling and for idle timeouts
|
||||
* simple packet filtering
|
||||
|
||||
For sending and receiving PPP frames, the generic PPP driver calls on
|
||||
the services of PPP `channels'. A PPP channel encapsulates a
|
||||
mechanism for transporting PPP frames from one machine to another. A
|
||||
PPP channel implementation can be arbitrarily complex internally but
|
||||
has a very simple interface with the generic PPP code: it merely has
|
||||
to be able to send PPP frames, receive PPP frames, and optionally
|
||||
handle ioctl requests. Currently there are PPP channel
|
||||
implementations for asynchronous serial ports, synchronous serial
|
||||
ports, and for PPP over ethernet.
|
||||
|
||||
This architecture makes it possible to implement PPP multilink in a
|
||||
natural and straightforward way, by allowing more than one channel to
|
||||
be linked to each ppp network interface unit. The generic layer is
|
||||
responsible for splitting datagrams on transmit and recombining them
|
||||
on receive.
|
||||
|
||||
|
||||
PPP channel API
|
||||
---------------
|
||||
|
||||
See include/linux/ppp_channel.h for the declaration of the types and
|
||||
functions used to communicate between the generic PPP layer and PPP
|
||||
channels.
|
||||
|
||||
Each channel has to provide two functions to the generic PPP layer,
|
||||
via the ppp_channel.ops pointer:
|
||||
|
||||
* start_xmit() is called by the generic layer when it has a frame to
|
||||
send. The channel has the option of rejecting the frame for
|
||||
flow-control reasons. In this case, start_xmit() should return 0
|
||||
and the channel should call the ppp_output_wakeup() function at a
|
||||
later time when it can accept frames again, and the generic layer
|
||||
will then attempt to retransmit the rejected frame(s). If the frame
|
||||
is accepted, the start_xmit() function should return 1.
|
||||
|
||||
* ioctl() provides an interface which can be used by a user-space
|
||||
program to control aspects of the channel's behaviour. This
|
||||
procedure will be called when a user-space program does an ioctl
|
||||
system call on an instance of /dev/ppp which is bound to the
|
||||
channel. (Usually it would only be pppd which would do this.)
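
The start_xmit() contract described above (return 1 to accept, return 0 to
reject and wake the generic layer later) can be modelled in userspace C. This
is only a sketch of the control flow: struct demo_channel, the queue depth and
both functions are hypothetical, and a counter stands in for the call to
ppp_output_wakeup() that a real channel would make.

```c
#include <assert.h>

#define DEMO_QLEN 2	/* arbitrary depth of the toy transmit queue */

struct demo_channel {
	int queued;		/* frames currently held by the channel */
	int need_wakeup;	/* set when a frame was rejected */
	int wakeups;		/* times the generic layer was notified */
};

/* Toy start_xmit(): return 1 if the frame is accepted, 0 if rejected. */
int demo_start_xmit(struct demo_channel *ch)
{
	assert(ch != 0);
	if (ch->queued >= DEMO_QLEN) {
		ch->need_wakeup = 1;	/* remember to wake the generic layer */
		return 0;
	}
	ch->queued++;
	return 1;
}

/*
 * Toy transmit-complete handler: frees one slot and, if a frame was
 * rejected earlier, signals that frames can be accepted again. A real
 * channel would call ppp_output_wakeup() where the counter is bumped.
 */
void demo_tx_done(struct demo_channel *ch)
{
	assert(ch != 0 && ch->queued > 0);
	ch->queued--;
	if (ch->need_wakeup) {
		ch->need_wakeup = 0;
		ch->wakeups++;
	}
}
```

Tracking the rejection in a flag keeps the wakeup from being issued when the
generic layer has nothing pending.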
|
||||
|
||||
The generic PPP layer provides seven functions to channels:
|
||||
|
||||
* ppp_register_channel() is called when a channel has been created, to
|
||||
notify the PPP generic layer of its presence. For example, setting
|
||||
a serial port to the PPPDISC line discipline causes the ppp_async
|
||||
channel code to call this function.
|
||||
|
||||
* ppp_unregister_channel() is called when a channel is to be
|
||||
destroyed. For example, the ppp_async channel code calls this when
|
||||
a hangup is detected on the serial port.
|
||||
|
||||
* ppp_output_wakeup() is called by a channel when it has previously
|
||||
rejected a call to its start_xmit function, and can now accept more
|
||||
packets.
|
||||
|
||||
* ppp_input() is called by a channel when it has received a complete
|
||||
PPP frame.
|
||||
|
||||
* ppp_input_error() is called by a channel when it has detected that a
|
||||
frame has been lost or dropped (for example, because of an FCS (frame
|
||||
check sequence) error).
|
||||
|
||||
* ppp_channel_index() returns the channel index assigned by the PPP
|
||||
generic layer to this channel. The channel should provide some way
|
||||
(e.g. an ioctl) to transmit this back to user-space, as user-space
|
||||
will need it to attach an instance of /dev/ppp to this channel.
|
||||
|
||||
* ppp_unit_number() returns the unit number of the ppp network
|
||||
interface to which this channel is connected, or -1 if the channel
|
||||
is not connected.
|
||||
|
||||
Connecting a channel to the ppp generic layer is initiated from the
|
||||
channel code, rather than from the generic layer. The channel is
|
||||
expected to have some way for a user-level process to control it
|
||||
independently of the ppp generic layer. For example, with the
|
||||
ppp_async channel, this is provided by the file descriptor to the
|
||||
serial port.
|
||||
|
||||
Generally a user-level process will initialize the underlying
|
||||
communications medium and prepare it to do PPP. For example, with an
|
||||
async tty, this can involve setting the tty speed and modes, issuing
|
||||
modem commands, and then going through some sort of dialog with the
|
||||
remote system to invoke PPP service there. We refer to this process
|
||||
as `discovery'. Then the user-level process tells the medium to
|
||||
become a PPP channel and register itself with the generic PPP layer.
|
||||
The channel then has to report the channel number assigned to it back
|
||||
to the user-level process. From that point, the PPP negotiation code
|
||||
in the PPP daemon (pppd) can take over and perform the PPP
|
||||
negotiation, accessing the channel through the /dev/ppp interface.
|
||||
|
||||
At the interface to the PPP generic layer, PPP frames are stored in
|
||||
skbuff structures and start with the two-byte PPP protocol number.
|
||||
The frame does *not* include the 0xff `address' byte or the 0x03
|
||||
`control' byte that are optionally used in async PPP. Nor is there
|
||||
any escaping of control characters, nor are there any FCS or framing
|
||||
characters included. That is all the responsibility of the channel
|
||||
code, if it is needed for the particular medium. That is, the skbuffs
|
||||
presented to the start_xmit() function contain only the 2-byte
|
||||
protocol number and the data, and the skbuffs presented to ppp_input()
|
||||
must be in the same format.
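
As an illustration of the frame format described above, the following user-space sketch shows how a channel might locate the protocol number in a received async frame. The helper names ppp_proto_offset() and ppp_proto() are hypothetical, and the sketch assumes that framing, escaping and the FCS have already been removed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: skip the optional 0xff 0x03 address/control bytes
 * so that the result starts with the 2-byte protocol number.
 * Returns the offset of the protocol field, or -1 if the frame is too
 * short to contain one.
 */
static int ppp_proto_offset(const uint8_t *frame, size_t len)
{
	size_t off = 0;

	if (len >= 2 && frame[0] == 0xff && frame[1] == 0x03)
		off = 2;
	if (len < off + 2)
		return -1;
	return (int)off;
}

/* Read the 2-byte PPP protocol number (most significant byte first). */
static unsigned ppp_proto(const uint8_t *frame, int off)
{
	return ((unsigned)frame[off] << 8) | frame[off + 1];
}
```

Everything from the returned offset onwards is what would be placed in the skbuff handed to ppp_input().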
|
||||
|
||||
The channel must provide an instance of a ppp_channel struct to
|
||||
represent the channel. The channel is free to use the `private' field
|
||||
however it wishes. The channel should initialize the `mtu' and
|
||||
`hdrlen' fields before calling ppp_register_channel() and not change
|
||||
them until after ppp_unregister_channel() returns. The `mtu' field
|
||||
represents the maximum size of the data part of the PPP frames, that
|
||||
is, it does not include the 2-byte protocol number.
|
||||
|
||||
If the channel needs some headroom in the skbuffs presented to it for
|
||||
transmission (i.e., some space free in the skbuff data area before the
|
||||
start of the PPP frame), it should set the `hdrlen' field of the
|
||||
ppp_channel struct to the amount of headroom required. The generic
|
||||
PPP layer will attempt to provide that much headroom but the channel
|
||||
should still check if there is sufficient headroom and copy the skbuff
|
||||
if there isn't.
|
||||
|
||||
On the input side, channels should ideally provide at least 2 bytes of
|
||||
headroom in the skbuffs presented to ppp_input(). The generic PPP
|
||||
code does not require this but will be more efficient if this is done.
|
||||
|
||||
|
||||
Buffering and flow control
|
||||
--------------------------
|
||||
|
||||
The generic PPP layer has been designed to minimize the amount of data
|
||||
that it buffers in the transmit direction. It maintains a queue of
|
||||
transmit packets for the PPP unit (network interface device) plus a
|
||||
queue of transmit packets for each attached channel. Normally the
|
||||
transmit queue for the unit will contain at most one packet; the
|
||||
exceptions are when pppd sends packets by writing to /dev/ppp, and
|
||||
when the core networking code calls the generic layer's start_xmit()
|
||||
function with the queue stopped, i.e. when the generic layer has
|
||||
called netif_stop_queue(), which only happens on a transmit timeout.
|
||||
The start_xmit function always accepts and queues the packet which it
|
||||
is asked to transmit.
|
||||
|
||||
Transmit packets are dequeued from the PPP unit transmit queue and
|
||||
then subjected to TCP/IP header compression and packet compression
|
||||
(Deflate or BSD-Compress compression), as appropriate. After this
|
||||
point the packets can no longer be reordered, as the decompression
|
||||
algorithms rely on receiving compressed packets in the same order that
|
||||
they were generated.
|
||||
|
||||
If multilink is not in use, this packet is then passed to the attached
|
||||
channel's start_xmit() function. If the channel refuses to take
|
||||
the packet, the generic layer saves it for later transmission. The
|
||||
generic layer will call the channel's start_xmit() function again
|
||||
when the channel calls ppp_output_wakeup() or when the core
|
||||
networking code calls the generic layer's start_xmit() function
|
||||
again. The generic layer contains no timeout and retransmission
|
||||
logic; it relies on the core networking code for that.
|
||||
|
||||
If multilink is in use, the generic layer divides the packet into one
|
||||
or more fragments and puts a multilink header on each fragment. It
|
||||
decides how many fragments to use based on the length of the packet
|
||||
and the number of channels which are potentially able to accept a
|
||||
fragment at the moment. A channel is potentially able to accept a
|
||||
fragment if it doesn't have any fragments currently queued up for it
|
||||
to transmit. The channel may still refuse a fragment; in this case
|
||||
the fragment is queued up for the channel to transmit later. This
|
||||
scheme has the effect that more fragments are given to higher-
|
||||
bandwidth channels. It also means that under light load, the generic
|
||||
layer will tend to fragment large packets across all the channels,
|
||||
thus reducing latency, while under heavy load, packets will tend to be
|
||||
transmitted as single fragments, thus reducing the overhead of
|
||||
fragmentation.
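
The effect described above can be sketched with a simplified model. This is not the algorithm used by the generic layer: mp_fragment_count() and its min_frag parameter are assumptions made for the illustration, which only shows how the fragment count can follow from the packet length and the number of channels with nothing queued.

```c
#include <stddef.h>

/*
 * Hypothetical sketch: choose a fragment count from the packet length and
 * the number of channels that currently have nothing queued. min_frag is
 * an assumed minimum useful fragment size, not a value from the kernel.
 */
static size_t mp_fragment_count(size_t pkt_len, size_t free_channels,
				size_t min_frag)
{
	size_t by_len;

	if (pkt_len == 0 || free_channels == 0 || min_frag == 0)
		return 1;  /* send as a single fragment */

	by_len = (pkt_len + min_frag - 1) / min_frag;  /* ceil(len / min_frag) */
	return by_len < free_channels ? by_len : free_channels;
}
```

With several idle channels a large packet is spread across them; with one idle channel (heavy load) the same packet goes out as a single fragment.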
|
||||
|
||||
|
||||
SMP safety
|
||||
----------
|
||||
|
||||
The PPP generic layer has been designed to be SMP-safe. Locks are
|
||||
used around accesses to the internal data structures where necessary
|
||||
to ensure their integrity. As part of this, the generic layer
|
||||
requires that the channels adhere to certain requirements and in turn
|
||||
provides certain guarantees to the channels. Essentially the channels
|
||||
are required to provide the appropriate locking on the ppp_channel
|
||||
structures that form the basis of the communication between the
|
||||
channel and the generic layer. This is because the channel provides
|
||||
the storage for the ppp_channel structure, and so the channel is
|
||||
required to provide the guarantee that this storage exists and is
|
||||
valid at the appropriate times.
|
||||
|
||||
The generic layer requires these guarantees from the channel:
|
||||
|
||||
* The ppp_channel object must exist from the time that
|
||||
ppp_register_channel() is called until after the call to
|
||||
ppp_unregister_channel() returns.
|
||||
|
||||
* No thread may be in a call to any of ppp_input(), ppp_input_error(),
|
||||
ppp_output_wakeup(), ppp_channel_index() or ppp_unit_number() for a
|
||||
channel at the time that ppp_unregister_channel() is called for that
|
||||
channel.
|
||||
|
||||
* ppp_register_channel() and ppp_unregister_channel() must be called
|
||||
from process context, not interrupt or softirq/BH context.
|
||||
|
||||
* The remaining generic layer functions may be called at softirq/BH
|
||||
level but must not be called from a hardware interrupt handler.
|
||||
|
||||
* The generic layer may call the channel start_xmit() function at
|
||||
softirq/BH level but will not call it at interrupt level. Thus the
|
||||
start_xmit() function may not block.
|
||||
|
||||
* The generic layer will only call the channel ioctl() function in
|
||||
process context.
|
||||
|
||||
The generic layer provides these guarantees to the channels:
|
||||
|
||||
* The generic layer will not call the start_xmit() function for a
|
||||
channel while any thread is already executing in that function for
|
||||
that channel.
|
||||
|
||||
* The generic layer will not call the ioctl() function for a channel
|
||||
while any thread is already executing in that function for that
|
||||
channel.
|
||||
|
||||
* By the time a call to ppp_unregister_channel() returns, no thread
|
||||
will be executing in a call from the generic layer to that channel's
|
||||
start_xmit() or ioctl() function, and the generic layer will not
|
||||
call either of those functions subsequently.
|
||||
|
||||
|
||||
Interface to pppd
|
||||
-----------------
|
||||
|
||||
The PPP generic layer exports a character device interface called
|
||||
/dev/ppp. This is used by pppd to control PPP interface units and
|
||||
channels. Although there is only one /dev/ppp, each open instance of
|
||||
/dev/ppp acts independently and can be attached either to a PPP unit
|
||||
or a PPP channel. This is achieved using the file->private_data field
|
||||
to point to a separate object for each open instance of /dev/ppp. In
|
||||
this way an effect similar to Solaris' clone open is obtained,
|
||||
allowing us to control an arbitrary number of PPP interfaces and
|
||||
channels without having to fill up /dev with hundreds of device names.
|
||||
|
||||
When /dev/ppp is opened, a new instance is created which is initially
|
||||
unattached. Using an ioctl call, it can then be attached to an
|
||||
existing unit, attached to a newly-created unit, or attached to an
|
||||
existing channel. An instance attached to a unit can be used to send
|
||||
and receive PPP control frames, using the read() and write() system
|
||||
calls, along with poll() if necessary. Similarly, an instance
|
||||
attached to a channel can be used to send and receive PPP frames on
|
||||
that channel.
|
||||
|
||||
In multilink terms, the unit represents the bundle, while the channels
|
||||
represent the individual physical links. Thus, a PPP frame sent by a
|
||||
write to the unit (i.e., to an instance of /dev/ppp attached to the
|
||||
unit) will be subject to bundle-level compression and to fragmentation
|
||||
across the individual links (if multilink is in use). In contrast, a
|
||||
PPP frame sent by a write to the channel will be sent as-is on that
|
||||
channel, without any multilink header.
|
||||
|
||||
A channel is not initially attached to any unit. In this state it can
|
||||
be used for PPP negotiation but not for the transfer of data packets.
|
||||
It can then be connected to a PPP unit with an ioctl call, which
|
||||
makes it available to send and receive data packets for that unit.
|
||||
|
||||
The ioctl calls which are available on an instance of /dev/ppp depend
|
||||
on whether it is unattached, attached to a PPP interface, or attached
|
||||
to a PPP channel. The ioctl calls which are available on an
|
||||
unattached instance are:
|
||||
|
||||
* PPPIOCNEWUNIT creates a new PPP interface and makes this /dev/ppp
|
||||
instance the "owner" of the interface. The argument should point to
|
||||
an int which is the desired unit number if >= 0, or -1 to assign the
|
||||
lowest unused unit number. Being the owner of the interface means
|
||||
that the interface will be shut down if this instance of /dev/ppp is
|
||||
closed.
|
||||
|
||||
* PPPIOCATTACH attaches this instance to an existing PPP interface.
|
||||
The argument should point to an int containing the unit number.
|
||||
This does not make this instance the owner of the PPP interface.
|
||||
|
||||
* PPPIOCATTCHAN attaches this instance to an existing PPP channel.
|
||||
The argument should point to an int containing the channel number.
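
A minimal user-space sketch of the PPPIOCNEWUNIT call on an unattached instance follows. The function name ppp_open_new_unit() is hypothetical; the ioctl itself comes from <linux/ppp-ioctl.h>. The call is expected to fail unless /dev/ppp exists and the caller has sufficient privileges.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/ppp-ioctl.h>

/*
 * Open /dev/ppp and create a new PPP unit. On success, returns the file
 * descriptor that owns the unit and stores the unit number in *unit.
 * On failure, returns -1 with errno describing the error.
 */
static int ppp_open_new_unit(int *unit)
{
	int fd = open("/dev/ppp", O_RDWR);

	if (fd < 0)
		return -1;

	*unit = -1;  /* -1 asks for the lowest unused unit number */
	if (ioctl(fd, PPPIOCNEWUNIT, unit) < 0) {
		int saved = errno;

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

Because the returned descriptor is the owner of the interface, closing it shuts the interface down again.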
|
||||
|
||||
The ioctl calls available on an instance of /dev/ppp attached to a
|
||||
channel are:
|
||||
|
||||
* PPPIOCDETACH detaches the instance from the channel. This ioctl is
|
||||
deprecated since the same effect can be achieved by closing the
|
||||
instance. In order to prevent possible races this ioctl will fail
|
||||
with an EINVAL error if more than one file descriptor refers to this
|
||||
instance (i.e. as a result of dup(), dup2() or fork()).
|
||||
|
||||
* PPPIOCCONNECT connects this channel to a PPP interface. The
|
||||
argument should point to an int containing the interface unit
|
||||
number. It will return an EINVAL error if the channel is already
|
||||
connected to an interface, or ENXIO if the requested interface does
|
||||
not exist.
|
||||
|
||||
* PPPIOCDISCONN disconnects this channel from the PPP interface that
|
||||
it is connected to. It will return an EINVAL error if the channel
|
||||
is not connected to an interface.
|
||||
|
||||
* All other ioctl commands are passed to the channel ioctl() function.
|
||||
|
||||
The ioctl calls that are available on an instance that is attached to
|
||||
an interface unit are:
|
||||
|
||||
* PPPIOCSMRU sets the MRU (maximum receive unit) for the interface.
|
||||
The argument should point to an int containing the new MRU value.
|
||||
|
||||
* PPPIOCSFLAGS sets flags which control the operation of the
|
||||
interface. The argument should be a pointer to an int containing
|
||||
the new flags value. The bits in the flags value that can be set
|
||||
are:
|
||||
SC_COMP_TCP enable transmit TCP header compression
|
||||
SC_NO_TCP_CCID disable connection-id compression for
|
||||
TCP header compression
|
||||
SC_REJ_COMP_TCP disable receive TCP header decompression
|
||||
SC_CCP_OPEN Compression Control Protocol (CCP) is
|
||||
open, so inspect CCP packets
|
||||
SC_CCP_UP CCP is up, may (de)compress packets
|
||||
SC_LOOP_TRAFFIC send IP traffic to pppd
|
||||
SC_MULTILINK enable PPP multilink fragmentation on
|
||||
transmitted packets
|
||||
SC_MP_SHORTSEQ expect short multilink sequence
|
||||
numbers on received multilink fragments
|
||||
SC_MP_XSHORTSEQ transmit short multilink sequence nos.
|
||||
|
||||
The values of these flags are defined in <linux/ppp-ioctl.h>. Note
|
||||
that the values of the SC_MULTILINK, SC_MP_SHORTSEQ and
|
||||
SC_MP_XSHORTSEQ bits are ignored if the CONFIG_PPP_MULTILINK option
|
||||
is not selected.
|
||||
|
||||
* PPPIOCGFLAGS returns the value of the status/control flags for the
|
||||
interface unit. The argument should point to an int where the ioctl
|
||||
will store the flags value. As well as the values listed above for
|
||||
PPPIOCSFLAGS, the following bits may be set in the returned value:
|
||||
SC_COMP_RUN CCP compressor is running
|
||||
SC_DECOMP_RUN CCP decompressor is running
|
||||
SC_DC_ERROR CCP decompressor detected non-fatal error
|
||||
SC_DC_FERROR CCP decompressor detected fatal error
|
||||
|
||||
* PPPIOCSCOMPRESS sets the parameters for packet compression or
|
||||
decompression. The argument should point to a ppp_option_data
|
||||
structure (defined in <linux/ppp-ioctl.h>), which contains a
|
||||
pointer/length pair which should describe a block of memory
|
||||
containing a CCP option specifying a compression method and its
|
||||
parameters. The ppp_option_data struct also contains a `transmit'
|
||||
field. If this is 0, the ioctl will affect the receive path,
|
||||
otherwise the transmit path.
|
||||
|
||||
* PPPIOCGUNIT returns, in the int pointed to by the argument, the unit
|
||||
number of this interface unit.
|
||||
|
||||
* PPPIOCSDEBUG sets the debug flags for the interface to the value in
|
||||
the int pointed to by the argument. Only the least significant bit
|
||||
is used; if this is 1 the generic layer will print some debug
|
||||
messages during its operation. This is only intended for debugging
|
||||
the generic PPP layer code; it is generally not helpful for working
|
||||
out why a PPP connection is failing.
|
||||
|
||||
* PPPIOCGDEBUG returns the debug flags for the interface in the int
|
||||
pointed to by the argument.
|
||||
|
||||
* PPPIOCGIDLE returns the time, in seconds, since the last data
|
||||
packets were sent and received. The argument should point to a
|
||||
ppp_idle structure (defined in <linux/ppp_defs.h>). If the
|
||||
CONFIG_PPP_FILTER option is enabled, the set of packets which reset
|
||||
the transmit and receive idle timers is restricted to those which
|
||||
pass the `active' packet filter.
|
||||
|
||||
* PPPIOCSMAXCID sets the maximum connection-ID parameter (and thus the
|
||||
number of connection slots) for the TCP header compressor and
|
||||
decompressor. The lower 16 bits of the int pointed to by the
|
||||
argument specify the maximum connection-ID for the compressor. If
|
||||
the upper 16 bits of that int are non-zero, they specify the maximum
|
||||
connection-ID for the decompressor, otherwise the decompressor's
|
||||
maximum connection-ID is set to 15.
|
||||
|
||||
* PPPIOCSNPMODE sets the network-protocol mode for a given network
|
||||
protocol. The argument should point to an npioctl struct (defined
|
||||
in <linux/ppp-ioctl.h>). The `protocol' field gives the PPP protocol
|
||||
number for the protocol to be affected, and the `mode' field
|
||||
specifies what to do with packets for that protocol:
|
||||
|
||||
NPMODE_PASS normal operation, transmit and receive packets
|
||||
NPMODE_DROP silently drop packets for this protocol
|
||||
NPMODE_ERROR drop packets and return an error on transmit
|
||||
NPMODE_QUEUE queue up packets for transmit, drop received
|
||||
packets
|
||||
|
||||
At present NPMODE_ERROR and NPMODE_QUEUE have the same effect as
|
||||
NPMODE_DROP.
|
||||
|
||||
* PPPIOCGNPMODE returns the network-protocol mode for a given
|
||||
protocol. The argument should point to an npioctl struct with the
|
||||
`protocol' field set to the PPP protocol number for the protocol of
|
||||
interest. On return the `mode' field will be set to the network-
|
||||
protocol mode for that protocol.
|
||||
|
||||
* PPPIOCSPASS and PPPIOCSACTIVE set the `pass' and `active' packet
|
||||
filters. These ioctls are only available if the CONFIG_PPP_FILTER
|
||||
option is selected. The argument should point to a sock_fprog
|
||||
structure (defined in <linux/filter.h>) containing the compiled BPF
|
||||
instructions for the filter. Packets are dropped if they fail the
|
||||
`pass' filter; otherwise, if they fail the `active' filter they are
|
||||
passed but they do not reset the transmit or receive idle timer.
|
||||
|
||||
* PPPIOCSMRRU enables or disables multilink processing for received
|
||||
packets and sets the multilink MRRU (maximum reconstructed receive
|
||||
unit). The argument should point to an int containing the new MRRU
|
||||
value. If the MRRU value is 0, processing of received multilink
|
||||
fragments is disabled. This ioctl is only available if the
|
||||
CONFIG_PPP_MULTILINK option is selected.
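
As an example of one of the argument layouts above, the int passed to PPPIOCSMAXCID could be built as follows. The helper names maxcid_pack() and maxcid_decomp() are hypothetical, and the sketch uses a 32-bit unsigned value to stand in for the int.

```c
#include <stdint.h>

/*
 * Pack the PPPIOCSMAXCID argument: compressor maximum connection-ID in
 * the lower 16 bits, decompressor maximum connection-ID in the upper 16
 * bits. An upper half of 0 selects the default decompressor value.
 */
static uint32_t maxcid_pack(uint16_t comp_maxcid, uint16_t decomp_maxcid)
{
	return ((uint32_t)decomp_maxcid << 16) | comp_maxcid;
}

/* Decompressor maximum connection-ID implied by a packed argument. */
static unsigned maxcid_decomp(uint32_t arg)
{
	unsigned hi = arg >> 16;

	return hi ? hi : 15;
}
```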
|
||||
|
||||
Last modified: 7-feb-2002
|
||||
48
Documentation/networking/proc_net_tcp.txt
Normal file
|
|
@ -0,0 +1,48 @@
|
|||
This document describes the interfaces /proc/net/tcp and /proc/net/tcp6.
|
||||
Note that these interfaces are deprecated in favor of tcp_diag.
|
||||
|
||||
These /proc interfaces provide information about currently active TCP
|
||||
connections, and are implemented by tcp4_seq_show() in net/ipv4/tcp_ipv4.c
|
||||
and tcp6_seq_show() in net/ipv6/tcp_ipv6.c, respectively.
|
||||
|
||||
Each file first lists all listening TCP sockets, and then lists all established
|
||||
TCP connections. A typical entry of /proc/net/tcp would look like this (split
|
||||
up into 3 parts because of the length of the line):
|
||||
|
||||
46: 010310AC:9C4C 030310AC:1770 01
|
||||
| | | | | |--> connection state
|
||||
| | | | |------> remote TCP port number
|
||||
| | | |-------------> remote IPv4 address
|
||||
| | |--------------------> local TCP port number
|
||||
| |---------------------------> local IPv4 address
|
||||
|----------------------------------> number of entry
|
||||
|
||||
00000150:00000000 01:00000019 00000000
|
||||
| | | | |--> number of unrecovered RTO timeouts
|
||||
| | | |----------> number of jiffies until timer expires
|
||||
| | |----------------> timer_active (see below)
|
||||
| |----------------------> receive-queue
|
||||
|-------------------------------> transmit-queue
|
||||
|
||||
1000 0 54165785 4 cd1e6040 25 4 27 3 -1
|
||||
| | | | | | | | | |--> slow start size threshold,
|
||||
| | | | | | | | | or -1 if the threshold
|
||||
| | | | | | | | | is >= 0xFFFF
|
||||
| | | | | | | | |----> sending congestion window
|
||||
| | | | | | | |-------> (ack.quick<<1)|ack.pingpong
|
||||
| | | | | | |---------> Predicted tick of soft clock
|
||||
| | | | | | (delayed ACK control data)
|
||||
| | | | | |------------> retransmit timeout
|
||||
| | | | |------------------> location of socket in memory
|
||||
| | | |-----------------------> socket reference count
|
||||
| | |-----------------------------> inode
|
||||
| |----------------------------------> unanswered 0-window probes
|
||||
|---------------------------------------------> uid
|
||||
|
||||
timer_active:
|
||||
0 no timer is pending
|
||||
1 retransmit-timer is pending
|
||||
2 another timer (e.g. delayed ack or keepalive) is pending
|
||||
3 this is a socket in TIME_WAIT state. Not all fields will contain
|
||||
data (or even exist)
|
||||
4 zero window probe timer is pending
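
The address and port fields shown above can be decoded in user space as sketched below. The function name decode_endpoint() is hypothetical. The sketch assumes the file was produced on a little-endian host, where the hexadecimal address shows the IPv4 bytes in reverse order; on a big-endian host the byte order would differ.

```c
#include <stdio.h>

/*
 * Decode one "ADDR:PORT" field from /proc/net/tcp, such as
 * "010310AC:9C4C". Assumes a little-endian host (see above).
 * Returns 0 on success, -1 on a parse error.
 */
static int decode_endpoint(const char *field, char *ip, size_t ip_len,
			   unsigned *port)
{
	unsigned addr;

	if (sscanf(field, "%8x:%4x", &addr, port) != 2)
		return -1;

	/* The least significant byte is the first octet of the address. */
	snprintf(ip, ip_len, "%u.%u.%u.%u",
		 addr & 0xff, (addr >> 8) & 0xff,
		 (addr >> 16) & 0xff, (addr >> 24) & 0xff);
	return 0;
}
```

Under that assumption, the local endpoint in the entry above decodes to 172.16.3.1 port 40012 and the remote endpoint to 172.16.3.3 port 6000.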
|
||||
152
Documentation/networking/radiotap-headers.txt
Normal file
|
|
@ -0,0 +1,152 @@
|
|||
How to use radiotap headers
|
||||
===========================
|
||||
|
||||
Pointer to the radiotap include file
|
||||
------------------------------------
|
||||
|
||||
Radiotap headers are variable-length and extensible; you can get most of the
|
||||
information you need to know on them from:
|
||||
|
||||
./include/net/ieee80211_radiotap.h
|
||||
|
||||
This document gives an overview and warns about some corner cases.
|
||||
|
||||
|
||||
Structure of the header
|
||||
-----------------------
|
||||
|
||||
There is a fixed portion at the start which contains a u32 bitmap that defines
|
||||
whether the possible argument associated with that bit is present. So if b0
|
||||
of the it_present member of ieee80211_radiotap_header is set, it means that
|
||||
the header for argument index 0 (IEEE80211_RADIOTAP_TSFT) is present in the
|
||||
argument area.
|
||||
|
||||
< 8-byte ieee80211_radiotap_header >
|
||||
[ <possible argument bitmap extensions ... > ]
|
||||
[ <argument> ... ]
|
||||
|
||||
At the moment there are only 13 possible argument indexes defined, but in case
|
||||
we run out of space in the u32 it_present member, it is defined that b31 set
|
||||
indicates that there is another u32 bitmap following (shown as "possible
|
||||
argument bitmap extensions..." above), and the start of the arguments is moved
|
||||
forward 4 bytes each time.
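
The bitmap extension rule can be sketched in user-space C as follows. The helper names rt_le32() and rt_args_offset() are hypothetical; the layout assumed is the one described above, with the first it_present word at offset 4 of the 8-byte header.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian u32 bytewise, so alignment does not matter. */
static uint32_t rt_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Return the offset of the first argument, or 0 if the buffer is too
 * short. Each bitmap word with b31 set is followed by another bitmap
 * word, which moves the start of the arguments forward by 4 bytes.
 */
static size_t rt_args_offset(const uint8_t *hdr, size_t len)
{
	size_t off = 4;

	for (;;) {
		uint32_t present;

		if (len < off + 4)
			return 0;
		present = rt_le32(hdr + off);
		off += 4;
		if (!(present & (1u << 31)))
			return off;
	}
}
```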
|
||||
|
||||
Note also that the it_len member __le16 is set to the total number of bytes
|
||||
covered by the ieee80211_radiotap_header and any arguments following.
|
||||
|
||||
|
||||
Requirements for arguments
|
||||
--------------------------
|
||||
|
||||
After the fixed part of the header, the arguments follow for each argument
|
||||
index whose matching bit is set in the it_present member of
|
||||
ieee80211_radiotap_header.
|
||||
|
||||
- the arguments are all stored little-endian!
|
||||
|
||||
- the argument payload for a given argument index has a fixed size. So
|
||||
IEEE80211_RADIOTAP_TSFT being present always indicates an 8-byte argument is
|
||||
present. See the comments in ./include/net/ieee80211_radiotap.h for a nice
|
||||
breakdown of all the argument sizes
|
||||
|
||||
- the arguments must be aligned to a boundary of the argument size using
|
||||
padding. So a u16 argument must start on the next u16 boundary if it isn't
|
||||
already on one, a u32 must start on the next u32 boundary and so on.
|
||||
|
||||
- "alignment" is relative to the start of the ieee80211_radiotap_header, ie,
|
||||
the first byte of the radiotap header. The absolute alignment of that first
|
||||
byte isn't defined. So even if the whole radiotap header is starting at, eg,
|
||||
address 0x00000003, still the first byte of the radiotap header is treated as
|
||||
0 for alignment purposes.
|
||||
|
||||
- the above point that there may be no absolute alignment for multibyte
|
||||
entities in the fixed radiotap header or the argument region means that you
|
||||
have to take special evasive action when trying to access these multibyte
|
||||
entities. Some arches like Blackfin cannot deal with an attempt to
|
||||
dereference, eg, a u16 pointer that is pointing to an odd address. Instead
|
||||
you have to use a kernel API get_unaligned() to dereference the pointer,
|
||||
which will do it bytewise on the arches that require that.
|
||||
|
||||
- The arguments for a given argument index can be a compound of multiple types
|
||||
together. For example IEEE80211_RADIOTAP_CHANNEL has an argument payload
|
||||
consisting of two u16s of total length 4. When this happens, the padding
|
||||
rule is applied dealing with a u16, NOT dealing with a 4-byte single entity.
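
The alignment and unaligned-access rules above can be sketched in user-space C. The helper names rt_align() and rt_le16() are hypothetical; in kernel code get_unaligned() plays the role that the bytewise read plays here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Round an offset, counted from the first byte of the radiotap header,
 * up to the next multiple of align. align must be a power of two, such
 * as 2, 4 or 8.
 */
static size_t rt_align(size_t offset, size_t align)
{
	return (offset + align - 1) & ~(align - 1);
}

/* Bytewise little-endian u16 read; safe at any address. */
static uint16_t rt_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}
```

For a compound argument such as IEEE80211_RADIOTAP_CHANNEL, the alignment passed in would be 2 (the size of a u16), not the total payload length of 4.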
|
||||
|
||||
|
||||
Example valid radiotap header
|
||||
-----------------------------
|
||||
|
||||
0x00, 0x00, // <-- radiotap version + pad byte
|
||||
0x0b, 0x00, // <- radiotap header length
|
||||
0x04, 0x0c, 0x00, 0x00, // <-- bitmap
|
||||
0x6c, // <-- rate (in 500kbps units)
|
||||
0x0c, //<-- tx power
|
||||
0x01 //<-- antenna
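
To check the example by hand, the same bytes can be decoded in user-space C. The names rt_example, rt_example_len() and rt_example_present() are hypothetical; the bit positions in the comments follow the argument indexes defined in include/net/ieee80211_radiotap.h (rate is index 2, dBm tx power index 10, antenna index 11).

```c
#include <stdint.h>

/* The example header from above as a byte array. */
static const uint8_t rt_example[] = {
	0x00, 0x00,             /* radiotap version + pad byte */
	0x0b, 0x00,             /* it_len = 11, little-endian */
	0x04, 0x0c, 0x00, 0x00, /* it_present: bits 2, 10 and 11 set */
	0x6c,                   /* rate: 108 * 500 kbps = 54 Mbps */
	0x0c,                   /* tx power */
	0x01                    /* antenna */
};

/* it_len: total header length in bytes, little-endian at offset 2. */
static unsigned rt_example_len(const uint8_t *hdr)
{
	return hdr[2] | (hdr[3] << 8);
}

/* it_present: argument bitmap, little-endian at offset 4. */
static uint32_t rt_example_present(const uint8_t *hdr)
{
	return (uint32_t)hdr[4] | ((uint32_t)hdr[5] << 8) |
	       ((uint32_t)hdr[6] << 16) | ((uint32_t)hdr[7] << 24);
}
```

The length field covers the 8-byte fixed header plus the three single-byte arguments, which is why it is 0x0b.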
|
||||
|
||||
|
||||
Using the Radiotap Parser
|
||||
-------------------------
|
||||
|
||||
If you are having to parse a radiotap struct, you can radically simplify the
|
||||
job by using the radiotap parser that lives in net/wireless/radiotap.c and has
|
||||
its prototypes available in include/net/cfg80211.h. You use it like this:
|
||||
|
||||
#include <net/cfg80211.h>
|
||||
|
||||
/* buf points to the start of the radiotap header part */
|
||||
|
||||
int MyFunction(u8 * buf, int buflen)
|
||||
{
|
||||
int pkt_rate_100kHz = 0, antenna = 0, pwr = 0;
|
||||
struct ieee80211_radiotap_iterator iterator;
|
||||
int ret = ieee80211_radiotap_iterator_init(&iterator, buf, buflen);
|
||||
|
||||
while (!ret) {
|
||||
|
||||
ret = ieee80211_radiotap_iterator_next(&iterator);
|
||||
|
||||
if (ret)
|
||||
continue;
|
||||
|
||||
/* see if this argument is something we can use */
|
||||
|
||||
switch (iterator.this_arg_index) {
|
||||
/*
|
||||
* You must take care when dereferencing iterator.this_arg
|
||||
* for multibyte types... the pointer is not aligned. Use
|
||||
* get_unaligned((type *)iterator.this_arg) to dereference
|
||||
* iterator.this_arg for type "type" safely on all arches.
|
||||
*/
|
||||
case IEEE80211_RADIOTAP_RATE:
|
||||
/* radiotap "rate" u8 is in
|
||||
* 500kbps units, eg, 0x02=1Mbps
|
||||
*/
|
||||
pkt_rate_100kHz = (*iterator.this_arg) * 5;
|
||||
break;
|
||||
|
||||
case IEEE80211_RADIOTAP_ANTENNA:
|
||||
/* radiotap uses 0 for 1st ant */
|
||||
antenna = *iterator.this_arg;
|
||||
break;
|
||||
|
||||
case IEEE80211_RADIOTAP_DBM_TX_POWER:
|
||||
pwr = *iterator.this_arg;
|
||||
break;
|
||||
|
||||
default:
|
||||
break;
|
||||
}
|
||||
} /* while more rt headers */
|
||||
|
||||
if (ret != -ENOENT)
|
||||
return TXRX_DROP;
|
||||
|
||||
/* discard the radiotap header part */
|
||||
buf += iterator.max_length;
|
||||
buflen -= iterator.max_length;
|
||||
|
||||
...
|
||||
|
||||
}
|
||||
|
||||
Andy Green <andy@warmcat.com>
|
||||
150
Documentation/networking/ray_cs.txt
Normal file
|
|
@ -0,0 +1,150 @@
|
|||
September 21, 1999
|
||||
|
||||
Copyright (c) 1998 Corey Thomas (corey@world.std.com)
|
||||
|
||||
This file is the documentation for the Raylink Wireless LAN card driver for
|
||||
Linux. The Raylink wireless LAN card is a PCMCIA card which provides IEEE
|
||||
802.11 compatible wireless network connectivity at 1 and 2 megabits/second.
|
||||
See http://www.raytheon.com/micro/raylink/ for more information on the Raylink
|
||||
card. This driver is in early development and does have bugs. See the known
|
||||
bugs and limitations at the end of this document for more information.
|
||||
This driver also works with WebGear's Aviator 2.4 and Aviator Pro
|
||||
wireless LAN cards.
|
||||
|
||||
As of kernel 2.3.18, the ray_cs driver is part of the Linux kernel
|
||||
source. My web page for the development of ray_cs is at
|
||||
http://web.ralinktech.com/ralink/Home/Support/Linux.html
|
||||
and I can be emailed at corey@world.std.com
|
||||
|
||||
The kernel driver is based on ray_cs-1.62.tgz
|
||||
|
||||
The driver at my web page is intended to be used as an add on to
|
||||
David Hinds' pcmcia package. All the command line parameters are
|
||||
available when compiled as a module. When built into the kernel, only
|
||||
the essid= string parameter is available via the kernel command line.
|
||||
This will change after the method of sorting out parameters for all
|
||||
the PCMCIA drivers is agreed upon. If you must have a built in driver
|
||||
with nondefault parameters, they can be edited in
|
||||
/usr/src/linux/drivers/net/pcmcia/ray_cs.c. Searching for module_param
|
||||
will find them all.
|
||||
|
||||
Information on card services is available at:
|
||||
http://pcmcia-cs.sourceforge.net/
|
||||
|
||||
|
||||
Card services user programs are still required for PCMCIA devices.
|
||||
pcmcia-cs-3.1.1 or greater is required for the kernel version of
|
||||
the driver.
|
||||
|
||||
Currently, ray_cs is not part of David Hinds' card services package,
|
||||
so the following magic is required.
|
||||
|
||||
At the end of the /etc/pcmcia/config.opts file, add the line:
|
||||
source ./ray_cs.opts
|
||||
This will make card services read the ray_cs.opts file
|
||||
when starting. Create the file /etc/pcmcia/ray_cs.opts containing the
|
||||
following:
|
||||
|
||||
#### start of /etc/pcmcia/ray_cs.opts ###################
|
||||
# Configuration options for Raylink Wireless LAN PCMCIA card
|
||||
device "ray_cs"
|
||||
class "network" module "misc/ray_cs"
|
||||
|
||||
card "RayLink PC Card WLAN Adapter"
|
||||
manfid 0x01a6, 0x0000
|
||||
bind "ray_cs"
|
||||
|
||||
module "misc/ray_cs" opts ""
|
||||
#### end of /etc/pcmcia/ray_cs.opts #####################
|
||||
|
||||
|
||||
To join an existing network with
|
||||
different parameters, contact the network administrator for the
|
||||
configuration information, and edit /etc/pcmcia/ray_cs.opts.
|
||||
Add the parameters below between the empty quotes.
|
||||
|
||||
Parameters for ray_cs driver which may be specified in ray_cs.opts:
|
||||
|
||||
bc integer 0 = normal mode (802.11 timing)
|
||||
1 = slow down inter frame timing to allow
|
||||
operation with older breezecom access
|
||||
points.
|
||||
|
||||
beacon_period integer beacon period in Kilo-microseconds
|
||||
legal values = must be integer multiple
|
||||
of hop dwell
|
||||
default = 256
|
||||
|
||||
country integer 1 = USA (default)
|
||||
2 = Europe
|
||||
3 = Japan
|
||||
4 = Korea
|
||||
5 = Spain
|
||||
6 = France
|
||||
7 = Israel
|
||||
8 = Australia
|
||||
|
||||
essid string ESS ID - network name to join
|
||||
string with maximum length of 32 chars
|
||||
default value = "ADHOC_ESSID"
|
||||
|
||||
hop_dwell integer hop dwell time in Kilo-microseconds
|
||||
legal values = 16,32,64,128(default),256
|
||||
|
||||
irq_mask integer linux standard 16 bit value 1bit/IRQ
|
||||
lsb is IRQ 0, bit 1 is IRQ 1 etc.
|
||||
Used to restrict choice of IRQ's to use.
|
||||
Recommended method for controlling
|
||||
interrupts is in /etc/pcmcia/config.opts
|
||||
|
||||
net_type integer 0 (default) = adhoc network,
|
||||
1 = infrastructure
|
||||
|
||||
phy_addr string string containing new MAC address in
|
||||
hex, must start with x eg
|
||||
x00008f123456
|
||||
|
||||
psm integer 0 = continuously active
|
||||
1 = power save mode (not useful yet)
|
||||
|
||||
pc_debug integer (0-5) larger values for more verbose
|
||||
logging. Replaces ray_debug.
|
||||
|
||||
ray_debug integer Replaced with pc_debug
|
||||
|
||||
ray_mem_speed integer defaults to 500
|
||||
|
||||
sniffer integer 0 = not sniffer (default)
|
||||
1 = sniffer which can be used to record all
|
||||
network traffic using tcpdump or similar,
|
||||
but no normal network use is allowed.
|
||||
|
||||
translate integer 0 = no translation (encapsulate frames)
|
||||
1 = translation (RFC1042/802.1)
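
The constraints listed for hop_dwell, beacon_period, irq_mask and phy_addr
can be checked before they are written into ray_cs.opts. The following is
a minimal userspace sketch, not part of the driver. All function names are
hypothetical, and the rule that phy_addr carries exactly 12 hex digits is
an assumption drawn from the single example above (a 6 byte MAC address).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* hop_dwell must be one of the documented legal values. */
static bool hop_dwell_ok(int hop_dwell)
{
    return hop_dwell == 16 || hop_dwell == 32 || hop_dwell == 64 ||
           hop_dwell == 128 || hop_dwell == 256;
}

/* beacon_period must be a positive integer multiple of hop_dwell. */
static bool beacon_period_ok(int beacon_period, int hop_dwell)
{
    return hop_dwell_ok(hop_dwell) && beacon_period > 0 &&
           beacon_period % hop_dwell == 0;
}

/* Build an irq_mask: bit n set means IRQ n may be used (lsb is IRQ 0).
 * Entries outside 0..15 are ignored. */
static uint16_t irq_mask_from_list(const int *irqs, size_t count)
{
    uint16_t mask = 0;

    for (size_t i = 0; i < count; i++)
        if (irqs[i] >= 0 && irqs[i] < 16)
            mask |= (uint16_t)(1u << irqs[i]);
    return mask;
}

/* phy_addr: 'x' followed by 12 hex digits (assumed), e.g. x00008f123456. */
static bool phy_addr_ok(const char *s)
{
    if (s == NULL || s[0] != 'x' || strlen(s) != 13)
        return false;
    for (size_t i = 1; i < 13; i++)
        if (!isxdigit((unsigned char)s[i]))
            return false;
    return true;
}
```

For instance, allowing only IRQs 3, 5 and 10 gives a mask with bits 3, 5
and 10 set, and the defaults beacon_period=256 with hop_dwell=128 pass the
multiple check.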
|
||||
|
||||
|
||||
More on sniffer mode:
|
||||
|
||||
tcpdump does not understand 802.11 headers, so it can't
|
||||
interpret the contents, but it can record to a file. This is only
|
||||
useful for debugging 802.11 lowlevel protocols that are not visible to
|
||||
linux. If you want to watch ftp xfers, or do similar things, you
|
||||
don't need to use sniffer mode. Also, some packet types are never
|
||||
sent up by the card, so you will never see them (ack, rts, cts, probe
|
||||
etc.) There is a simple program (showcap) included in the ray_cs
|
||||
package which parses the 802.11 headers.
|
||||
|
||||
Known Problems and missing features
|
||||
|
||||
Does not work with non x86
|
||||
|
||||
Does not work with SMP
|
||||
|
||||
Support for defragmenting frames is not yet debugged, and in
|
||||
fact is known to not work. I have never encountered a net set
|
||||
up to fragment, but still, it should be fixed.
|
||||
|
||||
The ioctl support is incomplete. The hardware address cannot be set
|
||||
using ifconfig yet. If a different hardware address is needed, it may
|
||||
be set using the phy_addr parameter in ray_cs.opts. This requires
|
||||
a card insertion to take effect.
|
||||
356
Documentation/networking/rds.txt
Normal file
|
|
@ -0,0 +1,356 @@
|
|||
|
||||
Overview
|
||||
========
|
||||
|
||||
This readme tries to provide some background on the hows and whys of RDS,
|
||||
and will hopefully help you find your way around the code.
|
||||
|
||||
In addition, please see this email about RDS origins:
|
||||
http://oss.oracle.com/pipermail/rds-devel/2007-November/000228.html
|
||||
|
||||
RDS Architecture
|
||||
================
|
||||
|
||||
RDS provides reliable, ordered datagram delivery by using a single
|
||||
reliable connection between any two nodes in the cluster. This allows
|
||||
applications to use a single socket to talk to any other process in the
|
||||
cluster - so in a cluster with N processes you need N sockets, in contrast
|
||||
to N*N if you use a connection-oriented socket transport like TCP.
|
||||
|
||||
RDS is not Infiniband-specific; it was designed to support different
|
||||
transports. The current implementation used to support RDS over TCP as well
|
||||
as IB. Work is in progress to support RDS over iWARP, and using DCE to
|
||||
guarantee no dropped packets on Ethernet, it may be possible to use RDS over
|
||||
UDP in the future.
|
||||
|
||||
The high-level semantics of RDS from the application's point of view are
|
||||
|
||||
* Addressing
|
||||
RDS uses IPv4 addresses and 16bit port numbers to identify
|
||||
the end point of a connection. All socket operations that involve
|
||||
passing addresses between kernel and user space generally
|
||||
use a struct sockaddr_in.
|
||||
|
||||
The fact that IPv4 addresses are used does not mean the underlying
|
||||
transport has to be IP-based. In fact, RDS over IB uses a
|
||||
reliable IB connection; the IP address is used exclusively to
|
||||
locate the remote node's GID (by ARPing for the given IP).
|
||||
|
||||
The port space is entirely independent of UDP, TCP or any other
|
||||
protocol.
|
||||
|
||||
* Socket interface
|
||||
RDS sockets work *mostly* as you would expect from a BSD
|
||||
socket. The next section will cover the details. At any rate,
|
||||
all I/O is performed through the standard BSD socket API.
|
||||
Some additions like zerocopy support are implemented through
|
||||
control messages, while other extensions use the getsockopt/
|
||||
setsockopt calls.
|
||||
|
||||
Sockets must be bound before you can send or receive data.
|
||||
This is needed because binding also selects a transport and
|
||||
attaches it to the socket. Once bound, the transport assignment
|
||||
does not change. RDS will tolerate IPs moving around (eg in
|
||||
an active-active HA scenario), but only as long as the address
|
||||
doesn't move to a different transport.
|
||||
|
||||
* sysctls
|
||||
RDS supports a number of sysctls in /proc/sys/net/rds
|
||||
|
||||
|
||||
Socket Interface
|
||||
================
|
||||
|
||||
AF_RDS, PF_RDS, SOL_RDS
|
||||
These constants haven't been assigned yet, because RDS isn't in
|
||||
mainline yet. Currently, the kernel module assigns some constant
|
||||
and publishes it to user space through two sysctl files
|
||||
/proc/sys/net/rds/pf_rds
|
||||
/proc/sys/net/rds/sol_rds
|
||||
|
||||
fd = socket(PF_RDS, SOCK_SEQPACKET, 0);
|
||||
This creates a new, unbound RDS socket.
|
||||
|
||||
setsockopt(SOL_SOCKET): send and receive buffer size
|
||||
RDS honors the send and receive buffer size socket options.
|
||||
You are not allowed to queue more than SO_SNDBUF bytes to
|
||||
a socket. A message is queued when sendmsg is called, and
|
||||
it leaves the queue when the remote system acknowledges
|
||||
its arrival.
|
||||
|
||||
The SO_RCVBUF option controls the maximum receive queue length.
|
||||
This is a soft limit rather than a hard limit - RDS will
|
||||
continue to accept and queue incoming messages, even if that
|
||||
takes the queue length over the limit. However, it will also
|
||||
mark the port as "congested" and send a congestion update to
|
||||
the source node. The source node is supposed to throttle any
|
||||
processes sending to this congested port.
|
||||
|
||||
bind(fd, &sockaddr_in, ...)
|
||||
This binds the socket to a local IP address and port, and a
|
||||
transport.
|
||||
|
||||
sendmsg(fd, ...)
|
||||
Sends a message to the indicated recipient. The kernel will
|
||||
transparently establish the underlying reliable connection
|
||||
if it isn't up yet.
|
||||
|
||||
An attempt to send a message that exceeds SO_SNDBUF will
|
||||
return -EMSGSIZE.
|
||||
|
||||
An attempt to send a message that would take the total number
|
||||
of queued bytes over the SO_SNDBUF threshold will return
|
||||
EAGAIN.
|
||||
|
||||
An attempt to send a message to a destination that is marked
|
||||
as "congested" will return ENOBUFS.
|
||||
|
||||
recvmsg(fd, ...)
|
||||
Receives a message that was queued to this socket. The socket's
|
||||
recv queue accounting is adjusted, and if the queue length
|
||||
drops below SO_RCVBUF, the port is marked uncongested, and
|
||||
a congestion update is sent to all peers.
|
||||
|
||||
Applications can ask the RDS kernel module to receive
|
||||
notifications via control messages (for instance, there is a
|
||||
notification when a congestion update arrived, or when an RDMA
|
||||
operation completes). These notifications are received through
|
||||
the msg.msg_control buffer of struct msghdr. The format of the
|
||||
messages is described in manpages.
|
||||
|
||||
poll(fd)
|
||||
RDS supports the poll interface to allow the application
|
||||
to implement async I/O.
|
||||
|
||||
POLLIN handling is pretty straightforward. When there's an
|
||||
incoming message queued to the socket, or a pending notification,
|
||||
we signal POLLIN.
|
||||
|
||||
POLLOUT is a little harder. Since you can essentially send
|
||||
to any destination, RDS will always signal POLLOUT as long as
|
||||
there's room on the send queue (ie the number of bytes queued
|
||||
is less than the sendbuf size).
|
||||
|
||||
However, the kernel will refuse to accept messages to
|
||||
a destination marked congested - in this case you will loop
|
||||
forever if you rely on poll to tell you what to do.
|
||||
This isn't a trivial problem, but applications can deal with
|
||||
this - by using congestion notifications, and by checking for
|
||||
ENOBUFS errors returned by sendmsg.
|
||||
|
||||
setsockopt(SOL_RDS, RDS_CANCEL_SENT_TO, &sockaddr_in)
|
||||
This allows the application to discard all messages queued to a
|
||||
specific destination on this particular socket.
|
||||
|
||||
This allows the application to cancel outstanding messages if
|
||||
it detects a timeout. For instance, if it tried to send a message,
|
||||
and the remote host is unreachable, RDS will keep trying forever.
|
||||
The application may decide it's not worth it, and cancel the
|
||||
operation. In this case, it would use RDS_CANCEL_SENT_TO to
|
||||
nuke any pending messages.
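
The three sendmsg failure cases described in this section can be pictured
with a small userspace model. This is only a sketch of the accounting rules
as the text states them, not the kernel's data structure: the struct, the
function names and the order in which the checks are made are assumptions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one socket's send queue. */
struct send_queue_model {
    size_t sndbuf;   /* configured send buffer size, in bytes */
    size_t queued;   /* bytes queued and not yet acknowledged */
};

/* Returns 0 on success or a negative errno, following the rules above. */
static int model_sendmsg(struct send_queue_model *q, size_t len,
                         bool dest_congested)
{
    if (len > q->sndbuf)
        return -EMSGSIZE;   /* the message alone exceeds the limit */
    if (q->queued + len > q->sndbuf)
        return -EAGAIN;     /* total queued bytes would exceed the limit */
    if (dest_congested)
        return -ENOBUFS;    /* destination port is marked congested */
    q->queued += len;       /* queued when sendmsg is called */
    return 0;
}

/* The remote system acknowledged len bytes, so they leave the queue. */
static void model_ack(struct send_queue_model *q, size_t len)
{
    q->queued -= (len > q->queued) ? q->queued : len;
}
```

An application that only waits for POLLOUT would keep retrying in the
ENOBUFS case, which is why the text recommends checking for that error and
using congestion notifications.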
|
||||
|
||||
|
||||
RDMA for RDS
|
||||
============
|
||||
|
||||
see rds-rdma(7) manpage (available in rds-tools)
|
||||
|
||||
|
||||
Congestion Notifications
|
||||
========================
|
||||
|
||||
see rds(7) manpage
|
||||
|
||||
|
||||
RDS Protocol
|
||||
============
|
||||
|
||||
Message header
|
||||
|
||||
The message header is a 'struct rds_header' (see rds.h):
|
||||
Fields:
|
||||
h_sequence:
|
||||
per-packet sequence number
|
||||
h_ack:
|
||||
piggybacked acknowledgment of last packet received
|
||||
h_len:
|
||||
length of data, not including header
|
||||
h_sport:
|
||||
source port
|
||||
h_dport:
|
||||
destination port
|
||||
h_flags:
|
||||
CONG_BITMAP - this is a congestion update bitmap
|
||||
ACK_REQUIRED - receiver must ack this packet
|
||||
RETRANSMITTED - packet has previously been sent
|
||||
h_credit:
|
||||
indicate to other end of connection that
|
||||
it has more credits available (i.e. there is
|
||||
more send room)
|
||||
h_padding[4]:
|
||||
unused, for future use
|
||||
h_csum:
|
||||
header checksum
|
||||
h_exthdr:
|
||||
optional data can be passed here. This is currently used for
|
||||
passing RDMA-related information.
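
As a rough picture of the header, the fields listed above could be laid out
as in the following sketch. The field order follows the list; the field
widths, the size of the extension area and the flag bit values are
assumptions made for illustration, so rds.h remains the authoritative
definition. On the wire the multi-byte fields would be in network byte
order, which this host-side sketch does not model.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_EXT_SPACE 16            /* assumed size of h_exthdr */

/* Assumed bit values for h_flags. */
#define SKETCH_FLAG_CONG_BITMAP   0x01
#define SKETCH_FLAG_ACK_REQUIRED  0x02
#define SKETCH_FLAG_RETRANSMITTED 0x04

struct rds_header_sketch {
    uint64_t h_sequence;               /* per-packet sequence number */
    uint64_t h_ack;                    /* piggybacked acknowledgment */
    uint32_t h_len;                    /* data length, header excluded */
    uint16_t h_sport;                  /* source port */
    uint16_t h_dport;                  /* destination port */
    uint8_t  h_flags;                  /* SKETCH_FLAG_* bits */
    uint8_t  h_credit;                 /* new send credits for the peer */
    uint8_t  h_padding[4];             /* unused */
    uint16_t h_csum;                   /* header checksum */
    uint8_t  h_exthdr[SKETCH_EXT_SPACE]; /* optional extension data */
};

/* True if the receiver must acknowledge this packet. */
static bool hdr_needs_ack(const struct rds_header_sketch *h)
{
    return (h->h_flags & SKETCH_FLAG_ACK_REQUIRED) != 0;
}

/* True if the payload is a congestion bitmap update. */
static bool hdr_is_cong_update(const struct rds_header_sketch *h)
{
    return (h->h_flags & SKETCH_FLAG_CONG_BITMAP) != 0;
}
```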
|
||||
|
||||
ACK and retransmit handling
|
||||
|
||||
One might think that with reliable IB connections you wouldn't need
|
||||
to ack messages that have been received. The problem is that IB
|
||||
hardware generates an ack message before it has DMAed the message
|
||||
into memory. This creates a potential message loss if the HCA is
|
||||
disabled for any reason between when it sends the ack and before
|
||||
the message is DMAed and processed. This is only a potential issue
|
||||
if another HCA is available for fail-over.
|
||||
|
||||
Sending an ack immediately would allow the sender to free the sent
|
||||
message from their send queue quickly, but could cause excessive
|
||||
traffic to be used for acks. RDS piggybacks acks on sent data
|
||||
packets. Ack-only packets are reduced by only allowing one to be
|
||||
in flight at a time, and by the sender only asking for acks when
|
||||
its send buffers start to fill up. All retransmissions are also
|
||||
acked.
|
||||
|
||||
Flow Control
|
||||
|
||||
RDS's IB transport uses a credit-based mechanism to verify that
|
||||
there is space in the peer's receive buffers for more data. This
|
||||
eliminates the need for hardware retries on the connection.
|
||||
|
||||
Congestion
|
||||
|
||||
Messages waiting in the receive queue on the receiving socket
|
||||
are accounted against the socket's SO_RCVBUF option value. Only
|
||||
the payload bytes in the message are accounted for. If the
|
||||
number of bytes queued equals or exceeds rcvbuf then the socket
|
||||
is congested. All sends attempted to this socket's address
|
||||
should block or return -EWOULDBLOCK.
|
||||
|
||||
Applications are expected to be reasonably tuned such that this
|
||||
situation very rarely occurs. An application encountering this
|
||||
"back-pressure" is considered a bug.
|
||||
|
||||
This is implemented by having each node maintain bitmaps which
|
||||
indicate which ports on bound addresses are congested. As the
|
||||
bitmap changes it is sent through all the connections which
|
||||
terminate in the local address of the bitmap which changed.
|
||||
|
||||
The bitmaps are allocated as connections are brought up. This
|
||||
avoids allocation in the interrupt handling path which queues
|
||||
messages on sockets. The dense bitmaps let transports send the
|
||||
entire bitmap on any bitmap change reasonably efficiently. This
|
||||
is much easier to implement than some finer-grained
|
||||
communication of per-port congestion. The sender does a very
|
||||
inexpensive bit test to test if the port it's about to send to
|
||||
is congested or not.
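
The "very inexpensive bit test" can be sketched as a dense bitmap with one
bit per 16bit port. The type and function names below are hypothetical and
the layout is an assumption; the kernel wraps its bitmap in struct
rds_cong_map and may organise the storage differently.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define CONG_PORTS 65536u   /* one bit for every possible 16bit port */

struct cong_bitmap_sketch {
    unsigned char bits[CONG_PORTS / CHAR_BIT];
};

/* Mark a port as congested. */
static void cong_set(struct cong_bitmap_sketch *m, uint16_t port)
{
    m->bits[port / CHAR_BIT] |= (unsigned char)(1u << (port % CHAR_BIT));
}

/* Mark a port as uncongested. */
static void cong_clear(struct cong_bitmap_sketch *m, uint16_t port)
{
    m->bits[port / CHAR_BIT] &= (unsigned char)~(1u << (port % CHAR_BIT));
}

/* The sender's check before queueing a message to this port. */
static bool cong_test(const struct cong_bitmap_sketch *m, uint16_t port)
{
    return ((m->bits[port / CHAR_BIT] >> (port % CHAR_BIT)) & 1u) != 0;
}
```

With CHAR_BIT equal to 8 the whole map is 8192 bytes, which is why sending
the entire bitmap on every change stays reasonably cheap.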
|
||||
|
||||
|
||||
RDS Transport Layer
|
||||
===================
|
||||
|
||||
As mentioned above, RDS is not IB-specific. Its code is divided
|
||||
into a general RDS layer and a transport layer.
|
||||
|
||||
The general layer handles the socket API, congestion handling,
|
||||
loopback, stats, usermem pinning, and the connection state machine.
|
||||
|
||||
The transport layer handles the details of the transport. The IB
|
||||
transport, for example, handles all the queue pairs, work requests,
|
||||
CM event handlers, and other Infiniband details.
|
||||
|
||||
|
||||
RDS Kernel Structures
|
||||
=====================
|
||||
|
||||
struct rds_message
|
||||
aka possibly "rds_outgoing", the generic RDS layer copies data to
|
||||
be sent and sets header fields as needed, based on the socket API.
|
||||
This is then queued for the individual connection and sent by the
|
||||
connection's transport.
|
||||
struct rds_incoming
|
||||
a generic struct referring to incoming data that can be handed from
|
||||
the transport to the general code and queued by the general code
|
||||
while the socket is awoken. It is then passed back to the transport
|
||||
code to handle the actual copy-to-user.
|
||||
struct rds_socket
|
||||
per-socket information
|
||||
struct rds_connection
|
||||
per-connection information
|
||||
struct rds_transport
|
||||
pointers to transport-specific functions
|
||||
struct rds_statistics
|
||||
non-transport-specific statistics
|
||||
struct rds_cong_map
|
||||
wraps the raw congestion bitmap, contains rbnode, waitq, etc.
|
||||
|
||||
Connection management
|
||||
=====================
|
||||
|
||||
Connections may be in UP, DOWN, CONNECTING, DISCONNECTING, and
|
||||
ERROR states.
|
||||
|
||||
The first time an attempt is made by an RDS socket to send data to
|
||||
a node, a connection is allocated and connected. That connection is
|
||||
then maintained forever -- if there are transport errors, the
|
||||
connection will be dropped and re-established.
|
||||
|
||||
Dropping a connection while packets are queued will cause queued or
|
||||
partially-sent datagrams to be retransmitted when the connection is
|
||||
re-established.
|
||||
|
||||
|
||||
The send path
|
||||
=============
|
||||
|
||||
rds_sendmsg()
|
||||
struct rds_message built from incoming data
|
||||
CMSGs parsed (e.g. RDMA ops)
|
||||
transport connection alloced and connected if not already
|
||||
rds_message placed on send queue
|
||||
send worker awoken
|
||||
rds_send_worker()
|
||||
calls rds_send_xmit() until queue is empty
|
||||
rds_send_xmit()
|
||||
transmits congestion map if one is pending
|
||||
may set ACK_REQUIRED
|
||||
calls transport to send either non-RDMA or RDMA message
|
||||
(RDMA ops never retransmitted)
|
||||
rds_ib_xmit()
|
||||
allocs work requests from send ring
|
||||
adds any new send credits available to peer (h_credit)
|
||||
maps the rds_message's sg list
|
||||
piggybacks ack
|
||||
populates work requests
|
||||
post send to connection's queue pair
|
||||
|
||||
The recv path
|
||||
=============
|
||||
|
||||
rds_ib_recv_cq_comp_handler()
|
||||
looks at write completions
|
||||
unmaps recv buffer from device
|
||||
no errors, call rds_ib_process_recv()
|
||||
refill recv ring
|
||||
rds_ib_process_recv()
|
||||
validate header checksum
|
||||
copy header to rds_ib_incoming struct if start of a new datagram
|
||||
add to ibinc's fraglist
|
||||
if completed datagram:
|
||||
update cong map if datagram was cong update
|
||||
call rds_recv_incoming() otherwise
|
||||
note if ack is required
|
||||
rds_recv_incoming()
|
||||
drop duplicate packets
|
||||
respond to pings
|
||||
find the sock associated with this datagram
|
||||
add to sock queue
|
||||
wake up sock
|
||||
do some congestion calculations
|
||||
rds_recvmsg
|
||||
copy data into user iovec
|
||||
handle CMSGs
|
||||
return to application
|
||||
|
||||
|
||||
214
Documentation/networking/regulatory.txt
Normal file
|
|
@ -0,0 +1,214 @@
|
|||
Linux wireless regulatory documentation
|
||||
---------------------------------------
|
||||
|
||||
This document gives a brief overview of how the Linux wireless
|
||||
regulatory infrastructure works.
|
||||
|
||||
More up to date information can be obtained at the project's web page:
|
||||
|
||||
http://wireless.kernel.org/en/developers/Regulatory
|
||||
|
||||
Keeping regulatory domains in userspace
|
||||
---------------------------------------
|
||||
|
||||
Due to the dynamic nature of regulatory domains we keep them
|
||||
in userspace and provide a framework for userspace to upload
|
||||
to the kernel one regulatory domain to be used as the central
|
||||
core regulatory domain all wireless devices should adhere to.
|
||||
|
||||
How to get regulatory domains to the kernel
|
||||
-------------------------------------------
|
||||
|
||||
Userspace gets a regulatory domain in the kernel by having
|
||||
a userspace agent build it and send it via nl80211. Only
|
||||
expected regulatory domains will be respected by the kernel.
|
||||
|
||||
A currently available userspace agent which can accomplish this
|
||||
is CRDA - central regulatory domain agent. It is documented here:
|
||||
|
||||
http://wireless.kernel.org/en/developers/Regulatory/CRDA
|
||||
|
||||
Essentially the kernel will send a udev event when it knows
|
||||
it needs a new regulatory domain. A udev rule can be put in place
|
||||
to trigger crda to send the respective regulatory domain for a
|
||||
specific ISO/IEC 3166 alpha2.
|
||||
|
||||
Below is an example udev rule which can be used:
|
||||
|
||||
# Example file, should be put in /etc/udev/rules.d/regulatory.rules
|
||||
KERNEL=="regulatory*", ACTION=="change", SUBSYSTEM=="platform", RUN+="/sbin/crda"
|
||||
|
||||
The alpha2 is passed as an environment variable under the variable COUNTRY.
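
A helper run from such a udev rule would read COUNTRY from its environment
before doing anything else. The following is a minimal sketch of that first
step only, assuming that a usable alpha2 is exactly two upper case ASCII
letters. The function names are hypothetical and this is not how CRDA
itself is implemented. Codes containing digits, such as the "99" placeholder
that appears later in this document, are rejected here as a simplification.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Assumed rule: exactly two upper case ASCII letters, e.g. "CR". */
static bool alpha2_looks_valid(const char *s)
{
    return s != NULL && strlen(s) == 2 &&
           isupper((unsigned char)s[0]) && isupper((unsigned char)s[1]);
}

/* Returns the COUNTRY value if it is well formed, otherwise NULL. */
static const char *country_from_env(void)
{
    const char *c = getenv("COUNTRY");

    return alpha2_looks_valid(c) ? c : NULL;
}
```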
|
||||
|
||||
Who asks for regulatory domains?
|
||||
--------------------------------
|
||||
|
||||
* Users
|
||||
|
||||
Users can use iw:
|
||||
|
||||
http://wireless.kernel.org/en/users/Documentation/iw
|
||||
|
||||
An example:
|
||||
|
||||
# set regulatory domain to "Costa Rica"
|
||||
iw reg set CR
|
||||
|
||||
This will request the kernel to set the regulatory domain to
|
||||
the specified alpha2. The kernel in turn will then ask userspace
|
||||
to provide a regulatory domain for the alpha2 specified by the user
|
||||
by sending a uevent.
|
||||
|
||||
* Wireless subsystems for Country Information elements
|
||||
|
||||
The kernel will send a uevent to inform userspace a new
|
||||
regulatory domain is required. More on this to be added
|
||||
as its integration is added.
|
||||
|
||||
* Drivers
|
||||
|
||||
If drivers determine they need a specific regulatory domain
|
||||
set they can inform the wireless core using regulatory_hint().
|
||||
They have two options -- they either provide an alpha2 so that
|
||||
crda can provide back a regulatory domain for that country or
|
||||
they can build their own regulatory domain based on internal
|
||||
custom knowledge so the wireless core can respect it.
|
||||
|
||||
*Most* drivers will rely on the first mechanism of providing a
|
||||
regulatory hint with an alpha2. For these drivers there is an additional
|
||||
check that can be used to ensure compliance based on custom EEPROM
|
||||
regulatory data. This additional check can be used by drivers by
|
||||
registering on its struct wiphy a reg_notifier() callback. This notifier
|
||||
is called when the core's regulatory domain has been changed. The driver
|
||||
can use this to review the changes made and also review who made them
|
||||
(driver, user, country IE) and determine what to allow based on its
|
||||
internal EEPROM data. Device drivers wishing to be capable of world
|
||||
roaming should use this callback. More on world roaming will be
|
||||
added to this document when its support is enabled.
|
||||
|
||||
Device drivers that provide their own built regulatory domain
|
||||
do not need a callback as the channels registered by them are
|
||||
the only ones that will be allowed and therefore *additional*
|
||||
channels cannot be enabled.
|
||||
|
||||
Example code - drivers hinting an alpha2:
|
||||
------------------------------------------
|
||||
|
||||
This example comes from the zd1211rw device driver. You can start
|
||||
by having a mapping of your device's EEPROM country/regulatory
|
||||
domain value to a specific alpha2 as follows:
|
||||
|
||||
static struct zd_reg_alpha2_map reg_alpha2_map[] = {
|
||||
{ ZD_REGDOMAIN_FCC, "US" },
|
||||
{ ZD_REGDOMAIN_IC, "CA" },
|
||||
{ ZD_REGDOMAIN_ETSI, "DE" }, /* Generic ETSI, use most restrictive */
|
||||
{ ZD_REGDOMAIN_JAPAN, "JP" },
|
||||
{ ZD_REGDOMAIN_JAPAN_ADD, "JP" },
|
||||
{ ZD_REGDOMAIN_SPAIN, "ES" },
|
||||
{ ZD_REGDOMAIN_FRANCE, "FR" },
};
|
||||
|
||||
Then you can define a routine to map your read EEPROM value to an alpha2,
|
||||
as follows:
|
||||
|
||||
static int zd_reg2alpha2(u8 regdomain, char *alpha2)
|
||||
{
|
||||
unsigned int i;
|
||||
struct zd_reg_alpha2_map *reg_map;
|
||||
for (i = 0; i < ARRAY_SIZE(reg_alpha2_map); i++) {
|
||||
reg_map = &reg_alpha2_map[i];
|
||||
if (regdomain == reg_map->reg) {
|
||||
alpha2[0] = reg_map->alpha2[0];
|
||||
alpha2[1] = reg_map->alpha2[1];
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
Lastly, you can then hint to the core of your discovered alpha2, if a match
|
||||
was found. You need to do this after you have registered your wiphy. You
|
||||
are expected to do this during initialization.
|
||||
|
||||
r = zd_reg2alpha2(mac->regdomain, alpha2);
|
||||
if (!r)
|
||||
regulatory_hint(hw->wiphy, alpha2);
|
||||
|
||||
Example code - drivers providing a built in regulatory domain:
|
||||
--------------------------------------------------------------
|
||||
|
||||
[NOTE: This API is not currently available, it can be added when required]
|
||||
|
||||
If you have regulatory information you can obtain from your
|
||||
driver and you *need* to use this we let you build a regulatory domain
|
||||
structure and pass it to the wireless core. To do this you should
|
||||
kmalloc() a structure big enough to hold your regulatory domain
|
||||
structure and you should then fill it with your data. Finally you simply
|
||||
call regulatory_hint() with the regulatory domain structure in it.
|
||||
|
||||
Below is a simple example, with a regulatory domain cached using the stack.
|
||||
Your implementation may vary (read EEPROM cache instead, for example).
|
||||
|
||||
Example cache of some regulatory domain
|
||||
|
||||
struct ieee80211_regdomain mydriver_jp_regdom = {
|
||||
.n_reg_rules = 3,
|
||||
.alpha2 = "JP",
|
||||
//.alpha2 = "99", /* If I have no alpha2 to map it to */
|
||||
.reg_rules = {
|
||||
/* IEEE 802.11b/g, channels 1..14 */
|
||||
REG_RULE(2412-20, 2484+20, 40, 6, 20, 0),
|
||||
/* IEEE 802.11a, channels 34..48 */
|
||||
REG_RULE(5170-20, 5240+20, 40, 6, 20,
|
||||
NL80211_RRF_NO_IR),
|
||||
/* IEEE 802.11a, channels 52..64 */
|
||||
REG_RULE(5260-20, 5320+20, 40, 6, 20,
|
||||
NL80211_RRF_NO_IR|
|
||||
NL80211_RRF_DFS),
|
||||
}
|
||||
};
|
||||
|
||||
Then in some part of your code after your wiphy has been registered:
|
||||
|
||||
struct ieee80211_regdomain *rd;
|
||||
int size_of_regd;
|
||||
int num_rules = mydriver_jp_regdom.n_reg_rules;
|
||||
unsigned int i;
|
||||
|
||||
size_of_regd = sizeof(struct ieee80211_regdomain) +
|
||||
(num_rules * sizeof(struct ieee80211_reg_rule));
|
||||
|
||||
rd = kzalloc(size_of_regd, GFP_KERNEL);
|
||||
if (!rd)
|
||||
return -ENOMEM;
|
||||
|
||||
memcpy(rd, &mydriver_jp_regdom, sizeof(struct ieee80211_regdomain));
|
||||
|
||||
for (i=0; i < num_rules; i++)
|
||||
memcpy(&rd->reg_rules[i],
|
||||
&mydriver_jp_regdom.reg_rules[i],
|
||||
sizeof(struct ieee80211_reg_rule));
|
||||
regulatory_struct_hint(rd);
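
To make the REG_RULE() arguments and the size calculation above easier to
follow, here is a userspace sketch that mimics them. Everything prefixed
with sk_ or SK_ is a stand-in written for this illustration. It assumes the
arguments are a start and end frequency in MHz, a maximum bandwidth in MHz,
an antenna gain in dBi, an EIRP in dBm and a flags word, stored as kHz and
hundredths of a dB; see the cfg80211 header for the real definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed unit conversions: MHz to kHz, dBi to mBi, dBm to mBm. */
#define SK_MHZ_TO_KHZ(mhz) ((mhz) * 1000)
#define SK_DBI_TO_MBI(dbi) ((dbi) * 100)
#define SK_DBM_TO_MBM(dbm) ((dbm) * 100)

struct sk_reg_rule {
    uint32_t start_freq_khz;
    uint32_t end_freq_khz;
    uint32_t max_bandwidth_khz;
    uint32_t max_antenna_gain;   /* in mBi */
    uint32_t max_eirp;           /* in mBm */
    uint32_t flags;
};

#define SK_REG_RULE(start, end, bw, gain, eirp, reg_flags)          \
    { SK_MHZ_TO_KHZ(start), SK_MHZ_TO_KHZ(end), SK_MHZ_TO_KHZ(bw),  \
      SK_DBI_TO_MBI(gain), SK_DBM_TO_MBM(eirp), (reg_flags) }

struct sk_regdomain {
    uint32_t n_reg_rules;
    char alpha2[3];
    struct sk_reg_rule reg_rules[];   /* flexible array member */
};

/* Bytes needed for a domain with n_rules rules, as computed above. */
static size_t sk_regdomain_size(uint32_t n_rules)
{
    return sizeof(struct sk_regdomain) +
           n_rules * sizeof(struct sk_reg_rule);
}

/* The 2.4 GHz rule from the example: 2392 MHz to 2504 MHz. */
static const struct sk_reg_rule sk_jp_24ghz =
    SK_REG_RULE(2412-20, 2484+20, 40, 6, 20, 0);
```

Writing the band edges as 2412-20 and 2484+20 keeps the channel centre
frequencies visible while still covering the channel width on both sides.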
|
||||
|
||||
Statically compiled regulatory database
|
||||
---------------------------------------
|
||||
|
||||
In most situations the userland solution using CRDA as described
|
||||
above is the preferred solution. However in some cases a set of
|
||||
rules built into the kernel itself may be desirable. To account
|
||||
for this situation, a configuration option has been provided
|
||||
(i.e. CONFIG_CFG80211_INTERNAL_REGDB). With this option enabled,
|
||||
the wireless database information contained in net/wireless/db.txt is
|
||||
used to generate a data structure encoded in net/wireless/regdb.c.
|
||||
That option also enables code in net/wireless/reg.c which queries
|
||||
the data in regdb.c as an alternative to using CRDA.
|
||||
|
||||
The file net/wireless/db.txt should be kept up-to-date with the db.txt
|
||||
file available in the git repository here:
|
||||
|
||||
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-regdb.git
|
||||
|
||||
Again, most users in most situations should be using the CRDA package
|
||||
provided with their distribution, and in most other situations users
|
||||
should be building and using CRDA on their own rather than using
|
||||
this option. If you are not absolutely sure that you should be using
|
||||
CONFIG_CFG80211_INTERNAL_REGDB then _DO_NOT_USE_IT_.
|
||||
947
Documentation/networking/rxrpc.txt
Normal file
|
|
@ -0,0 +1,947 @@
|
|||
======================
|
||||
RxRPC NETWORK PROTOCOL
|
||||
======================
|
||||
|
||||
The RxRPC protocol driver provides a reliable two-phase transport on top of UDP
|
||||
that can be used to perform RxRPC remote operations. This is done over sockets
|
||||
of AF_RXRPC family, using sendmsg() and recvmsg() with control data to send and
|
||||
receive data, aborts and errors.
|
||||
|
||||
Contents of this document:
|
||||
|
||||
(*) Overview.
|
||||
|
||||
(*) RxRPC protocol summary.
|
||||
|
||||
(*) AF_RXRPC driver model.
|
||||
|
||||
(*) Control messages.
|
||||
|
||||
(*) Socket options.
|
||||
|
||||
(*) Security.
|
||||
|
||||
(*) Example client usage.
|
||||
|
||||
(*) Example server usage.
|
||||
|
||||
(*) AF_RXRPC kernel interface.
|
||||
|
||||
(*) Configurable parameters.
|
||||
|
||||
|
||||
========
|
||||
OVERVIEW
|
||||
========
|
||||
|
||||
RxRPC is a two-layer protocol. There is a session layer which provides
|
||||
reliable virtual connections using UDP over IPv4 (or IPv6) as the transport
|
||||
layer, but implements a real network protocol; and there's the presentation
|
||||
layer which renders structured data to binary blobs and back again using XDR
|
||||
(as does SunRPC):
|
||||
|
||||
+-------------+
|
||||
| Application |
|
||||
+-------------+
|
||||
| XDR | Presentation
|
||||
+-------------+
|
||||
| RxRPC | Session
|
||||
+-------------+
|
||||
| UDP | Transport
|
||||
+-------------+
|
||||
|
||||
|
||||
AF_RXRPC provides:
|
||||
|
||||
(1) Part of an RxRPC facility for both kernel and userspace applications by
|
||||
making the session part of it a Linux network protocol (AF_RXRPC).
|
||||
|
||||
(2) A two-phase protocol. The client transmits a blob (the request) and then
|
||||
receives a blob (the reply), and the server receives the request and then
|
||||
transmits the reply.
|
||||
|
||||
(3) Retention of the reusable bits of the transport system set up for one call
|
||||
to speed up subsequent calls.
|
||||
|
||||
(4) A secure protocol, using the Linux kernel's key retention facility to
|
||||
manage security on the client end. The server end must of necessity be
|
||||
more active in security negotiations.
|
||||
|
||||
AF_RXRPC does not provide XDR marshalling/presentation facilities. That is
|
||||
left to the application. AF_RXRPC only deals in blobs. Even the operation ID
|
||||
is just the first four bytes of the request blob, and as such is beyond the
|
||||
kernel's interest.
|
||||
|
||||
|
||||
Sockets of AF_RXRPC family are:
|
||||
|
||||
(1) created as type SOCK_DGRAM;
|
||||
|
||||
(2) provided with a protocol of the type of underlying transport they're going
|
||||
to use - currently only PF_INET is supported.
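
Put together, the two points above translate into a single socket() call.
The following is a minimal sketch; the wrapper name is hypothetical, and
the fallback value for AF_RXRPC is an assumption used only if the C library
headers do not define it.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef AF_RXRPC
#define AF_RXRPC 33   /* assumed value, used only if the libc lacks it */
#endif

/*
 * Open an AF_RXRPC socket of type SOCK_DGRAM carried over PF_INET.
 * Returns a descriptor, or -1 with errno set (for example
 * EAFNOSUPPORT when the running kernel has no AF_RXRPC support).
 */
static int open_rxrpc_socket(void)
{
    return socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
}
```

The caller should check for -1 before using the descriptor and close() it
when done.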
|
||||
|
||||
|
||||
The Andrew File System (AFS) is an example of an application that uses this and
|
||||
that has both kernel (filesystem) and userspace (utility) components.
|
||||
|
||||
|
||||
======================
|
||||
RXRPC PROTOCOL SUMMARY
|
||||
======================
|
||||
|
||||
An overview of the RxRPC protocol:
|
||||
|
||||
(*) RxRPC sits on top of another networking protocol (UDP is the only option
|
||||
currently), and uses this to provide network transport. UDP ports, for
|
||||
example, provide transport endpoints.
|
||||
|
||||
(*) RxRPC supports multiple virtual "connections" from any given transport
|
||||
endpoint, thus allowing the endpoints to be shared, even to the same
|
||||
remote endpoint.
|
||||
|
||||
(*) Each connection goes to a particular "service". A connection may not go
|
||||
to multiple services. A service may be considered the RxRPC equivalent of
|
||||
a port number. AF_RXRPC permits multiple services to share an endpoint.
|
||||
|
||||
(*) Client-originating packets are marked, thus a transport endpoint can be
|
||||
shared between client and server connections (connections have a
|
||||
direction).
|
||||
|
||||
(*) Up to a billion connections may be supported concurrently between one
|
||||
local transport endpoint and one service on one remote endpoint. An RxRPC
|
||||
connection is described by seven numbers:
|
||||
|
||||
Local address }
|
||||
Local port } Transport (UDP) address
|
||||
Remote address }
|
||||
Remote port }
|
||||
Direction
|
||||
Connection ID
|
||||
Service ID
|
||||
|
||||
(*) Each RxRPC operation is a "call". A connection may make up to four
|
||||
billion calls, but only up to four calls may be in progress on a
|
||||
connection at any one time.
|
||||
|
||||
(*) Calls are two-phase and asymmetric: the client sends its request data,
|
||||
which the service receives; then the service transmits the reply data
|
||||
which the client receives.
|
||||
|
||||
(*) The data blobs are of indefinite size; the end of a phase is marked with a
|
||||
flag in the packet. The number of packets of data making up one blob may
|
||||
not exceed 4 billion, however, as this would cause the sequence number to
|
||||
wrap.
|
||||
|
||||
(*) The first four bytes of the request data are the service operation ID.
|
||||
|
||||
(*) Security is negotiated on a per-connection basis. The connection is
|
||||
initiated by the first data packet on it arriving. If security is
|
||||
requested, the server then issues a "challenge" and then the client
|
||||
replies with a "response". If the response is successful, the security is
|
||||
set for the lifetime of that connection, and all subsequent calls made
|
||||
upon it use that same security. In the event that the server lets a
|
||||
connection lapse before the client, the security will be renegotiated if
|
||||
the client uses the connection again.
|
||||
|
||||
(*) Calls use ACK packets to handle reliability. Data packets are also
|
||||
explicitly sequenced per call.
|
||||
|
||||
(*) There are two types of positive acknowledgment: hard-ACKs and soft-ACKs.
|
||||
A hard-ACK indicates to the far side that all the data received to a point
|
||||
has been received and processed; a soft-ACK indicates that the data has
|
||||
been received but may yet be discarded and re-requested. The sender may
|
||||
not discard any transmittable packets until they've been hard-ACK'd.
|
||||
|
||||
(*) Reception of a reply data packet implicitly hard-ACK's all the data
|
||||
packets that make up the request.
|
||||
|
||||
(*) A call is complete when the request has been sent, the reply has been
|
||||
received and the final hard-ACK on the last packet of the reply has
|
||||
reached the server.
|
||||
|
||||
(*) A call may be aborted by either end at any time up to its completion.
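
The limits and the ACK rules in the summary above can be modelled in a few
lines of userspace C. This is only an illustrative sketch: every name in it
(conn_key, conns_needed, tx_window and so on) is hypothetical and none of it
is taken from the kernel implementation. It shows the seven numbers that
identify a connection, how the four-calls-per-connection limit translates
into a number of parallel connections, and why only a hard-ACK lets the
sender discard buffered packets.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names come from the kernel. */

#define CALLS_PER_CONN 4	/* at most four calls in progress per connection */

/* The seven numbers that describe one RxRPC connection (IPv4 transport). */
struct conn_key {
	uint32_t local_addr;
	uint16_t local_port;
	uint32_t remote_addr;
	uint16_t remote_port;
	bool     client_initiated;	/* direction */
	uint32_t conn_id;
	uint16_t service_id;
};

/* Number of parallel connections needed to carry n_calls concurrent calls. */
static unsigned int conns_needed(unsigned int n_calls)
{
	return (n_calls + CALLS_PER_CONN - 1) / CALLS_PER_CONN;
}

/* Sender's view of one phase: packets first_unacked..next_seq-1 are buffered. */
struct tx_window {
	uint32_t first_unacked;	/* lowest sequence number not yet hard-ACK'd */
	uint32_t next_seq;	/* next sequence number to be assigned */
};

/*
 * Apply a hard-ACK covering everything up to and including seq.
 * Returns the number of packets the sender may now discard.
 */
static uint32_t tx_hard_ack(struct tx_window *w, uint32_t seq)
{
	uint32_t freed;

	if (seq < w->first_unacked || seq >= w->next_seq)
		return 0;	/* stale or out-of-range ACK: nothing to free */
	freed = seq - w->first_unacked + 1;
	w->first_unacked = seq + 1;
	return freed;
}

/*
 * A soft-ACK frees nothing: the receiver may still discard the data and
 * re-request it, so the sender has to keep every packet buffered.
 */
static uint32_t tx_soft_ack(struct tx_window *w, uint32_t seq)
{
	(void)w;
	(void)seq;
	return 0;
}
```

For example, five concurrent calls to one service need two connections in
this model, and a soft-ACK for packet 3 leaves the window untouched whereas a
hard-ACK for packet 3 releases packets 1 to 3.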
|
||||
|
||||
|
||||
=====================
|
||||
AF_RXRPC DRIVER MODEL
|
||||
=====================
|
||||
|
||||
About the AF_RXRPC driver:
|
||||
|
||||
(*) The AF_RXRPC protocol transparently uses internal sockets of the transport
|
||||
protocol to represent transport endpoints.
|
||||
|
||||
(*) AF_RXRPC sockets map onto RxRPC connection bundles. Actual RxRPC
|
||||
connections are handled transparently. One client socket may be used to
|
||||
make multiple simultaneous calls to the same service. One server socket
|
||||
may handle calls from many clients.
|
||||
|
||||
(*) Additional parallel client connections will be initiated to support extra
|
||||
concurrent calls, up to a tunable limit.
|
||||
|
||||
(*) Each connection is retained for a certain amount of time [tunable] after
|
||||
the last call currently using it has completed in case a new call is made
|
||||
that could reuse it.
|
||||
|
||||
(*) Each internal UDP socket is retained [tunable] for a certain amount of
|
||||
time [tunable] after the last connection using it is discarded, in case a new
|
||||
connection is made that could use it.
|
||||
|
||||
(*) A client-side connection is only shared between calls if they have
|
||||
the same key struct describing their security (and assuming the calls
|
||||
would otherwise share the connection). Non-secured calls would also be
|
||||
able to share connections with each other.
|
||||
|
||||
(*) A server-side connection is shared if the client says it is.
|
||||
|
||||
(*) ACK'ing is handled by the protocol driver automatically, including ping
|
||||
replying.
|
||||
|
||||
(*) SO_KEEPALIVE automatically pings the other side to keep the connection
|
||||
alive [TODO].
|
||||
|
||||
(*) If an ICMP error is received, all calls affected by that error will be
|
||||
aborted with an appropriate network error passed through recvmsg().
|
||||
|
||||
|
||||
Interaction with the user of the RxRPC socket:
|
||||
|
||||
(*) A socket is made into a server socket by binding an address with a
|
||||
non-zero service ID.
|
||||
|
||||
(*) In the client, sending a request is achieved with one or more sendmsgs,
|
||||
followed by the reply being received with one or more recvmsgs.
|
||||
|
||||
(*) The first sendmsg for a request to be sent from a client contains a tag to
|
||||
be used in all other sendmsgs or recvmsgs associated with that call. The
|
||||
tag is carried in the control data.
|
||||
|
||||
(*) connect() is used to supply a default destination address for a client
|
||||
socket. This may be overridden by supplying an alternate address to the
|
||||
first sendmsg() of a call (struct msghdr::msg_name).
|
||||
|
||||
(*) If connect() is called on an unbound client, a random local port will be
|
||||
bound before the operation takes place.
|
||||
|
||||
(*) A server socket may also be used to make client calls. To do this, the
|
||||
first sendmsg() of the call must specify the target address. The server's
|
||||
transport endpoint is used to send the packets.
|
||||
|
||||
(*) Once the application has received the last message associated with a call,
|
||||
the tag is guaranteed not to be seen again, and so it can be used to pin
|
||||
client resources. A new call can then be initiated with the same tag
|
||||
without fear of interference.
|
||||
|
||||
(*) In the server, a request is received with one or more recvmsgs, then the
|
||||
reply is transmitted with one or more sendmsgs, and then the final ACK
|
||||
is received with a last recvmsg.
|
||||
|
||||
(*) When sending data for a call, sendmsg is given MSG_MORE if there's more
|
||||
data to come on that call.
|
||||
|
||||
(*) When receiving data for a call, recvmsg flags MSG_MORE if there's more
|
||||
data to come for that call.
|
||||
|
||||
(*) When receiving data or messages for a call, MSG_EOR is flagged by recvmsg
|
||||
to indicate the terminal message for that call.
|
||||
|
||||
(*) A call may be aborted by adding an abort control message to the control
|
||||
data. Issuing an abort terminates the kernel's use of that call's tag.
|
||||
Any messages waiting in the receive queue for that call will be discarded.
|
||||
|
||||
(*) Aborts, busy notifications and challenge packets are delivered by recvmsg,
|
||||
and control data messages will be set to indicate the context. Receiving
|
||||
an abort or a busy message terminates the kernel's use of that call's tag.
|
||||
|
||||
(*) The control data part of the msghdr struct is used for a number of things:
|
||||
|
||||
(*) The tag of the intended or affected call.
|
||||
|
||||
(*) Sending or receiving errors, aborts and busy notifications.
|
||||
|
||||
(*) Notifications of incoming calls.
|
||||
|
||||
(*) Sending debug requests and receiving debug replies [TODO].
|
||||
|
||||
(*) When the kernel has received and set up an incoming call, it sends a
|
||||
message to the server application to let it know there's a new call awaiting
|
||||
its acceptance [recvmsg reports a special control message]. The server
|
||||
application then uses sendmsg to assign a tag to the new call. Once that
|
||||
is done, the first part of the request data will be delivered by recvmsg.
|
||||
|
||||
(*) The server application has to provide the server socket with a keyring of
|
||||
secret keys corresponding to the security types it permits. When a secure
|
||||
connection is being set up, the kernel looks up the appropriate secret key
|
||||
in the keyring and then sends a challenge packet to the client and
|
||||
receives a response packet. The kernel then checks the authorisation of
|
||||
the packet and either aborts the connection or sets up the security.
|
||||
|
||||
(*) The name of the key a client will use to secure its communications is
|
||||
nominated by a socket option.
|
||||
|
||||
|
||||
Notes on recvmsg:
|
||||
|
||||
(1) If there's a sequence of data messages belonging to a particular call on
|
||||
the receive queue, then recvmsg will keep working through them until:
|
||||
|
||||
(a) it meets the end of that call's received data,
|
||||
|
||||
(b) it meets a non-data message,
|
||||
|
||||
(c) it meets a message belonging to a different call, or
|
||||
|
||||
(d) it fills the user buffer.
|
||||
|
||||
If recvmsg is called in blocking mode, it will keep sleeping, awaiting the
|
||||
reception of further data, until one of the above four conditions is met.
|
||||
|
||||
(2) MSG_PEEK operates similarly, but will return immediately if it has put any
|
||||
data in the buffer rather than sleeping until it can fill the buffer.
|
||||
|
||||
(3) If a data message is only partially consumed in filling a user buffer,
|
||||
then the remainder of that message will be left on the front of the queue
|
||||
for the next taker. MSG_TRUNC will never be flagged.
|
||||
|
||||
(4) If there is more data to be had on a call (it hasn't copied the last byte
|
||||
of the last data message in that phase yet), then MSG_MORE will be
|
||||
flagged.
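
The flag handling described in these notes can be sketched as a small helper
that turns the msg_flags reported by recvmsg() into one of three states. The
enum and function names are hypothetical; only MSG_MORE and MSG_EOR come from
the text. The wrapper around recvmsg() is a minimal sketch that assumes the
caller supplies a control buffer large enough for the control messages
described in the next section.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical names; only MSG_MORE and MSG_EOR come from the text. */
enum rx_progress {
	RX_MORE_IN_PHASE,	/* MSG_MORE: more data to come for this call */
	RX_PHASE_DONE,		/* last byte of this phase has been copied */
	RX_CALL_DONE,		/* MSG_EOR: terminal message for the call */
};

static enum rx_progress rx_progress_from_flags(int msg_flags)
{
	if (msg_flags & MSG_EOR)
		return RX_CALL_DONE;
	if (msg_flags & MSG_MORE)
		return RX_MORE_IN_PHASE;
	return RX_PHASE_DONE;
}

/*
 * Read one chunk into buf.  Returns the number of bytes read, or -1 with
 * errno set, and stores the interpreted flags in *progress on success.
 */
static ssize_t rx_read_chunk(int fd, void *buf, size_t len,
			     void *control, size_t control_len,
			     enum rx_progress *progress)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct msghdr msg = {
		.msg_iov	= &iov,
		.msg_iovlen	= 1,
		.msg_control	= control,
		.msg_controllen	= control_len,
	};
	ssize_t n = recvmsg(fd, &msg, 0);

	if (n >= 0)
		*progress = rx_progress_from_flags(msg.msg_flags);
	return n;
}
```

A caller would typically loop on rx_read_chunk() while it reports
RX_MORE_IN_PHASE, since a partially consumed message stays at the front of
the queue and MSG_TRUNC is never flagged.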
|
||||
|
||||
|
||||
================
|
||||
CONTROL MESSAGES
|
||||
================
|
||||
|
||||
AF_RXRPC makes use of control messages in sendmsg() and recvmsg() to multiplex
|
||||
calls, to invoke certain actions and to report certain conditions. These are:
|
||||
|
||||
MESSAGE ID              SRT DATA        MEANING
======================= === =========== ===============================
RXRPC_USER_CALL_ID      sr- User ID     App's call specifier
RXRPC_ABORT             srt Abort code  Abort code to issue/received
RXRPC_ACK               -rt n/a         Final ACK received
RXRPC_NET_ERROR         -rt error num   Network error on call
RXRPC_BUSY              -rt n/a         Call rejected (server busy)
RXRPC_LOCAL_ERROR       -rt error num   Local error encountered
RXRPC_NEW_CALL          -r- n/a         New call received
RXRPC_ACCEPT            s-- n/a         Accept new call
|
||||
|
||||
(SRT = usable in Sendmsg / delivered by Recvmsg / Terminal message)
|
||||
|
||||
(*) RXRPC_USER_CALL_ID
|
||||
|
||||
This is used to indicate the application's call ID. It's an unsigned long
|
||||
that the app specifies in the client by attaching it to the first data
|
||||
message or in the server by passing it in association with an RXRPC_ACCEPT
|
||||
message. recvmsg() passes it in conjunction with all messages except
RXRPC_NEW_CALL messages.
|
||||
|
||||
(*) RXRPC_ABORT
|
||||
|
||||
This can be used by an application to abort a call by passing it to
|
||||
sendmsg, or it can be delivered by recvmsg to indicate a remote abort was
|
||||
received. Either way, it must be associated with an RXRPC_USER_CALL_ID to
|
||||
specify the call affected. If an abort is being sent, then error EBADSLT
|
||||
will be returned if there is no call with that user ID.
|
||||
|
||||
(*) RXRPC_ACK
|
||||
|
||||
This is delivered to a server application to indicate that the final ACK
|
||||
of a call was received from the client. It will be associated with an
|
||||
RXRPC_USER_CALL_ID to indicate the call that's now complete.
|
||||
|
||||
(*) RXRPC_NET_ERROR
|
||||
|
||||
This is delivered to an application to indicate that an ICMP error message
|
||||
was encountered in the process of trying to talk to the peer. An
|
||||
errno-class integer value will be included in the control message data
|
||||
indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
|
||||
affected.
|
||||
|
||||
(*) RXRPC_BUSY
|
||||
|
||||
This is delivered to a client application to indicate that a call was
|
||||
rejected by the server due to the server being busy. It will be
|
||||
associated with an RXRPC_USER_CALL_ID to indicate the rejected call.
|
||||
|
||||
(*) RXRPC_LOCAL_ERROR
|
||||
|
||||
This is delivered to an application to indicate that a local error was
|
||||
encountered and that a call has been aborted because of it. An
|
||||
errno-class integer value will be included in the control message data
|
||||
indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
|
||||
affected.
|
||||
|
||||
(*) RXRPC_NEW_CALL
|
||||
|
||||
This is delivered to indicate to a server application that a new call has
|
||||
arrived and is awaiting acceptance. No user ID is associated with this,
|
||||
as a user ID must subsequently be assigned by doing an RXRPC_ACCEPT.
|
||||
|
||||
(*) RXRPC_ACCEPT
|
||||
|
||||
This is used by a server application to attempt to accept a call and
|
||||
assign it a user ID. It should be associated with an RXRPC_USER_CALL_ID
|
||||
to indicate the user ID to be assigned. If there is no call to be
|
||||
accepted (it may have timed out, been aborted, etc.), then sendmsg will
|
||||
return error ENODATA. If the user ID is already in use by another call,
|
||||
then error EBADSLT will be returned.
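
As an illustration of the receive side of the table above, the following
sketch walks the control messages returned by recvmsg() and records the call
they refer to and what happened to it. The rx_event structure and function
names are hypothetical. The numeric values of SOL_RXRPC and the message IDs
are assumptions believed to mirror linux/rxrpc.h; a real program should take
them from the kernel headers rather than from this sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed values; prefer the definitions in the kernel headers. */
#ifndef SOL_RXRPC
#define SOL_RXRPC		272
#endif
#define RXRPC_USER_CALL_ID	1
#define RXRPC_ABORT		2
#define RXRPC_ACK		3
#define RXRPC_NET_ERROR		5
#define RXRPC_BUSY		6
#define RXRPC_LOCAL_ERROR	7
#define RXRPC_NEW_CALL		8

/* Hypothetical summary of the control data attached to one message. */
struct rx_event {
	bool		has_call_id;
	unsigned long	call_id;
	int		kind;		/* 0 = plain data, else a message ID */
	uint32_t	abort_code;	/* valid when kind == RXRPC_ABORT */
	int		error;		/* valid for the two error kinds */
};

static bool cmsg_holds(const struct cmsghdr *c, size_t n)
{
	return c->cmsg_len >= CMSG_LEN(n);
}

static void rx_parse_cmsgs(struct msghdr *msg, struct rx_event *ev)
{
	struct cmsghdr *c;

	memset(ev, 0, sizeof(*ev));
	for (c = CMSG_FIRSTHDR(msg); c; c = CMSG_NXTHDR(msg, c)) {
		if (c->cmsg_level != SOL_RXRPC)
			continue;
		switch (c->cmsg_type) {
		case RXRPC_USER_CALL_ID:
			if (cmsg_holds(c, sizeof(ev->call_id))) {
				memcpy(&ev->call_id, CMSG_DATA(c),
				       sizeof(ev->call_id));
				ev->has_call_id = true;
			}
			break;
		case RXRPC_ABORT:
			if (cmsg_holds(c, sizeof(ev->abort_code)))
				memcpy(&ev->abort_code, CMSG_DATA(c),
				       sizeof(ev->abort_code));
			ev->kind = c->cmsg_type;
			break;
		case RXRPC_NET_ERROR:
		case RXRPC_LOCAL_ERROR:
			if (cmsg_holds(c, sizeof(ev->error)))
				memcpy(&ev->error, CMSG_DATA(c),
				       sizeof(ev->error));
			ev->kind = c->cmsg_type;
			break;
		default:	/* RXRPC_ACK, RXRPC_BUSY, RXRPC_NEW_CALL: no data */
			ev->kind = c->cmsg_type;
			break;
		}
	}
}

/* True for the kinds marked "t" (terminal) in the table. */
static bool rx_event_is_terminal(const struct rx_event *ev)
{
	switch (ev->kind) {
	case RXRPC_ABORT:
	case RXRPC_ACK:
	case RXRPC_NET_ERROR:
	case RXRPC_BUSY:
	case RXRPC_LOCAL_ERROR:
		return true;
	default:
		return false;
	}
}
```

Once rx_event_is_terminal() reports true for a call, the application can
release whatever it pinned with that call's user ID.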
|
||||
|
||||
|
||||
==============
|
||||
SOCKET OPTIONS
|
||||
==============
|
||||
|
||||
AF_RXRPC sockets support a few socket options at the SOL_RXRPC level:
|
||||
|
||||
(*) RXRPC_SECURITY_KEY
|
||||
|
||||
This is used to specify the description of the key to be used. The key is
|
||||
extracted from the calling process's keyrings with request_key() and
|
||||
should be of "rxrpc" type.
|
||||
|
||||
The optval pointer points to the description string, and optlen indicates
|
||||
how long the string is, without the NUL terminator.
|
||||
|
||||
(*) RXRPC_SECURITY_KEYRING
|
||||
|
||||
Similar to above but specifies a keyring of server secret keys to use (key
|
||||
type "keyring"). See the "Security" section.
|
||||
|
||||
(*) RXRPC_EXCLUSIVE_CONNECTION
|
||||
|
||||
This is used to request that new connections should be used for each call
|
||||
made subsequently on this socket. optval should be NULL and optlen 0.
|
||||
|
||||
(*) RXRPC_MIN_SECURITY_LEVEL
|
||||
|
||||
This is used to specify the minimum security level required for calls on
|
||||
this socket. optval must point to an int containing one of the following
|
||||
values:
|
||||
|
||||
(a) RXRPC_SECURITY_PLAIN
|
||||
|
||||
Encrypted checksum only.
|
||||
|
||||
(b) RXRPC_SECURITY_AUTH
|
||||
|
||||
Encrypted checksum plus packet padded and first eight bytes of packet
|
||||
encrypted - which includes the actual packet length.
|
||||
|
||||
(c) RXRPC_SECURITY_ENCRYPTED
|
||||
|
||||
Encrypted checksum plus entire packet padded and encrypted, including
|
||||
actual packet length.
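
The two options that take no key description might be wrapped as below. This
is a sketch only: the helper names are hypothetical, the constant spellings
follow this text, and the numeric values are assumptions about linux/rxrpc.h
(which should be used in preference where available).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Assumed values; prefer the definitions in the kernel headers. */
#ifndef SOL_RXRPC
#define SOL_RXRPC			272
#endif
#define RXRPC_EXCLUSIVE_CONNECTION	3
#define RXRPC_MIN_SECURITY_LEVEL	4
#define RXRPC_SECURITY_PLAIN		0
#define RXRPC_SECURITY_AUTH		1
#define RXRPC_SECURITY_ENCRYPTED	2

/* Require at least the given security level for calls on this socket. */
static int rx_set_min_security(int fd, int level)
{
	if (level < RXRPC_SECURITY_PLAIN || level > RXRPC_SECURITY_ENCRYPTED) {
		errno = EINVAL;
		return -1;
	}
	/* optval must point to an int holding the level */
	return setsockopt(fd, SOL_RXRPC, RXRPC_MIN_SECURITY_LEVEL,
			  &level, sizeof(level));
}

/* Ask for a fresh connection for each subsequent call on this socket. */
static int rx_set_exclusive(int fd)
{
	/* optval is NULL and optlen is 0 for this option */
	return setsockopt(fd, SOL_RXRPC, RXRPC_EXCLUSIVE_CONNECTION, NULL, 0);
}
```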
|
||||
|
||||
|
||||
========
|
||||
SECURITY
|
||||
========
|
||||
|
||||
Currently, only the Kerberos 4 equivalent protocol has been implemented
|
||||
(security index 2 - rxkad). This requires the rxkad module to be loaded and,
|
||||
on the client, tickets of the appropriate type to be obtained from the AFS
|
||||
kaserver or the Kerberos server and installed as "rxrpc" type keys. This is
|
||||
normally done using the klog program. An example simple klog program can be
|
||||
found at:
|
||||
|
||||
http://people.redhat.com/~dhowells/rxrpc/klog.c
|
||||
|
||||
The payload provided to add_key() on the client should be of the following
|
||||
form:
|
||||
|
||||
struct rxrpc_key_sec2_v1 {
	uint16_t	security_index;	/* 2 */
	uint16_t	ticket_length;	/* length of ticket[] */
	uint32_t	expiry;		/* time at which expires */
	uint8_t		kvno;		/* key version number */
	uint8_t		__pad[3];
	uint8_t		session_key[8];	/* DES session key */
	uint8_t		ticket[0];	/* the encrypted ticket */
};
|
||||
|
||||
Where the ticket blob is just appended to the above structure.
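
A minimal sketch of assembling such a payload in userspace follows. The
helper name rxkad_build_payload() is hypothetical, and the structure is
redeclared locally with a C99 flexible array member in place of ticket[0];
the field layout is the one given above. Obtaining the ticket and session key
is outside the scope of the sketch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local copy of the layout shown above, using a flexible array member. */
struct rxrpc_key_sec2_v1 {
	uint16_t	security_index;	/* 2 */
	uint16_t	ticket_length;	/* length of ticket[] */
	uint32_t	expiry;		/* time at which expires */
	uint8_t		kvno;		/* key version number */
	uint8_t		__pad[3];
	uint8_t		session_key[8];	/* DES session key */
	uint8_t		ticket[];	/* the encrypted ticket */
};

/*
 * Build the add_key() payload: the fixed header with the ticket blob
 * appended.  Returns a malloc'd buffer (caller frees) and stores its size
 * in *len, or returns NULL if allocation fails.
 */
static void *rxkad_build_payload(uint32_t expiry, uint8_t kvno,
				 const uint8_t session_key[8],
				 const void *ticket, uint16_t ticket_len,
				 size_t *len)
{
	size_t total = sizeof(struct rxrpc_key_sec2_v1) + ticket_len;
	struct rxrpc_key_sec2_v1 *p = calloc(1, total);

	if (!p)
		return NULL;
	p->security_index = 2;		/* rxkad */
	p->ticket_length = ticket_len;
	p->expiry = expiry;
	p->kvno = kvno;
	memcpy(p->session_key, session_key, sizeof(p->session_key));
	if (ticket_len)
		memcpy(p->ticket, ticket, ticket_len);
	*len = total;
	return p;
}
```

The resulting buffer and length would then be passed to add_key() as the
payload of an "rxrpc" type key.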
|
||||
|
||||
|
||||
For the server, keys of type "rxrpc_s" must be made available to the server.
|
||||
They have a description of "<serviceID>:<securityIndex>" (eg: "52:2" for an
|
||||
rxkad key for the AFS VL service). When such a key is created, it should be
|
||||
given the server's secret key as the instantiation data (see the example
|
||||
below).
|
||||
|
||||
add_key("rxrpc_s", "52:2", secret_key, 8, keyring);
|
||||
|
||||
A keyring is passed to the server socket by naming it in a sockopt. The server
|
||||
socket then looks the server secret keys up in this keyring when secure
|
||||
incoming connections are made. This can be seen in an example program that can
|
||||
be found at:
|
||||
|
||||
http://people.redhat.com/~dhowells/rxrpc/listen.c
|
||||
|
||||
|
||||
====================
|
||||
EXAMPLE CLIENT USAGE
|
||||
====================
|
||||
|
||||
A client would issue an operation by:
|
||||
|
||||
(1) An RxRPC socket is set up by:
|
||||
|
||||
client = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
|
||||
|
||||
Where the third parameter indicates the protocol family of the transport
|
||||
socket used - usually IPv4 but it can also be IPv6 [TODO].
|
||||
|
||||
(2) A local address can optionally be bound:
|
||||
|
||||
struct sockaddr_rxrpc srx = {
	.srx_family	= AF_RXRPC,
	.srx_service	= 0,		/* we're a client */
	.transport_type	= SOCK_DGRAM,	/* type of transport socket */
	.transport_len	= sizeof(struct sockaddr_in),
	.transport.sin.sin_family	= AF_INET,
	.transport.sin.sin_port		= htons(7000),	/* AFS callback */
	.transport.sin.sin_addr.s_addr	= htonl(INADDR_ANY), /* all local interfaces */
};
bind(client, (struct sockaddr *)&srx, sizeof(srx));
|
||||
|
||||
This specifies the local UDP port to be used. If not given, a random
|
||||
non-privileged port will be used. A UDP port may be shared between
|
||||
several unrelated RxRPC sockets. Security is handled on a basis of
|
||||
per-RxRPC virtual connection.
|
||||
|
||||
(3) The security is set:
|
||||
|
||||
const char *key = "AFS:cambridge.redhat.com";
|
||||
setsockopt(client, SOL_RXRPC, RXRPC_SECURITY_KEY, key, strlen(key));
|
||||
|
||||
This issues a request_key() to get the key representing the security
|
||||
context. The minimum security level can be set:
|
||||
|
||||
unsigned int sec = RXRPC_SECURITY_ENCRYPTED;
|
||||
setsockopt(client, SOL_RXRPC, RXRPC_MIN_SECURITY_LEVEL,
|
||||
&sec, sizeof(sec));
|
||||
|
||||
(4) The server to be contacted can then be specified (alternatively this can
|
||||
be done through sendmsg):
|
||||
|
||||
struct sockaddr_rxrpc srx = {
	.srx_family	= AF_RXRPC,
	.srx_service	= VL_SERVICE_ID,
	.transport_type	= SOCK_DGRAM,	/* type of transport socket */
	.transport_len	= sizeof(struct sockaddr_in),
	.transport.sin.sin_family	= AF_INET,
	.transport.sin.sin_port		= htons(7005),	/* AFS volume manager */
	.transport.sin.sin_addr		= ...,
};
connect(client, (struct sockaddr *)&srx, sizeof(srx));
|
||||
|
||||
(5) The request data should then be posted to the client socket using a series
|
||||
of sendmsg() calls, each with the following control message attached:
|
||||
|
||||
RXRPC_USER_CALL_ID - specifies the user ID for this call
|
||||
|
||||
MSG_MORE should be set in msghdr::msg_flags on all but the last part of
|
||||
the request. Multiple requests may be made simultaneously.
|
||||
|
||||
If a call is intended to go to a destination other than the default
|
||||
specified through connect(), then msghdr::msg_name should be set on the
|
||||
first request message of that call.
|
||||
|
||||
(6) The reply data will then be posted to the client socket for recvmsg() to
|
||||
pick up. MSG_MORE will be flagged by recvmsg() if there's more reply data
|
||||
for a particular call to be read. MSG_EOR will be set on the terminal
|
||||
read for a call.
|
||||
|
||||
All data will be delivered with the following control message attached:
|
||||
|
||||
RXRPC_USER_CALL_ID - specifies the user ID for this call
|
||||
|
||||
If an abort or error occurred, this will be returned in the control data
|
||||
buffer instead, and MSG_EOR will be flagged to indicate the end of that
|
||||
call.
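
Step (5) above can be sketched as follows. The helper names are hypothetical,
and the values of SOL_RXRPC and RXRPC_USER_CALL_ID are assumptions believed
to match the kernel headers, which should be preferred. From userspace,
MSG_MORE is passed in the flags argument of sendmsg(), since that is how
sendmsg(2) receives its flags.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Assumed values; prefer the definitions in the kernel headers. */
#ifndef SOL_RXRPC
#define SOL_RXRPC		272
#endif
#define RXRPC_USER_CALL_ID	1

/* Control buffer big enough for one RXRPC_USER_CALL_ID message. */
union call_id_ctl {
	char		buf[CMSG_SPACE(sizeof(unsigned long))];
	struct cmsghdr	align;	/* forces suitable alignment of buf */
};

/* Attach the application's call ID to msg, using ctl as backing store. */
static void set_call_id(struct msghdr *msg, union call_id_ctl *ctl,
			unsigned long call_id)
{
	struct cmsghdr *c;

	memset(ctl, 0, sizeof(*ctl));
	msg->msg_control = ctl->buf;
	msg->msg_controllen = sizeof(ctl->buf);
	c = CMSG_FIRSTHDR(msg);
	c->cmsg_level = SOL_RXRPC;
	c->cmsg_type = RXRPC_USER_CALL_ID;
	c->cmsg_len = CMSG_LEN(sizeof(call_id));
	memcpy(CMSG_DATA(c), &call_id, sizeof(call_id));
}

/*
 * Send one part of a request on a connected client socket.  Pass
 * last != 0 for the final part so that MSG_MORE is left clear.
 */
static ssize_t send_request_part(int fd, unsigned long call_id,
				 const void *data, size_t len, int last)
{
	union call_id_ctl ctl;
	struct iovec iov = { .iov_base = (void *)data, .iov_len = len };
	struct msghdr msg;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	set_call_id(&msg, &ctl, call_id);
	return sendmsg(fd, &msg, last ? 0 : MSG_MORE);
}
```

To direct a call somewhere other than the connect() default, the first
sendmsg() of that call would additionally set msg_name and msg_namelen to a
struct sockaddr_rxrpc, which this sketch omits.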
|
||||
|
||||
|
||||
====================
|
||||
EXAMPLE SERVER USAGE
|
||||
====================
|
||||
|
||||
A server would be set up to accept operations in the following manner:
|
||||
|
||||
(1) An RxRPC socket is created by:
|
||||
|
||||
server = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
|
||||
|
||||
Where the third parameter indicates the address type of the transport
|
||||
socket used - usually IPv4.
|
||||
|
||||
(2) Security is set up if desired by giving the socket a keyring with server
|
||||
secret keys in it:
|
||||
|
||||
keyring = add_key("keyring", "AFSkeys", NULL, 0,
|
||||
KEY_SPEC_PROCESS_KEYRING);
|
||||
|
||||
const char secret_key[8] = {
|
||||
0xa7, 0x83, 0x8a, 0xcb, 0xc7, 0x83, 0xec, 0x94 };
|
||||
add_key("rxrpc_s", "52:2", secret_key, 8, keyring);
|
||||
|
||||
setsockopt(server, SOL_RXRPC, RXRPC_SECURITY_KEYRING, "AFSkeys", 7);
|
||||
|
||||
The keyring can be manipulated after it has been given to the socket. This
|
||||
permits the server to add more keys, replace keys, etc. whilst it is live.
|
||||
|
||||
(3) A local address must then be bound:
|
||||
|
||||
struct sockaddr_rxrpc srx = {
	.srx_family	= AF_RXRPC,
	.srx_service	= VL_SERVICE_ID, /* RxRPC service ID */
	.transport_type	= SOCK_DGRAM,	/* type of transport socket */
	.transport_len	= sizeof(struct sockaddr_in),
	.transport.sin.sin_family	= AF_INET,
	.transport.sin.sin_port		= htons(7000),	/* AFS callback */
	.transport.sin.sin_addr.s_addr	= htonl(INADDR_ANY), /* all local interfaces */
};
bind(server, (struct sockaddr *)&srx, sizeof(srx));
|
||||
|
||||
(4) The server is then set to listen out for incoming calls:
|
||||
|
||||
listen(server, 100);
|
||||
|
||||
(5) The kernel notifies the server of pending incoming connections by sending
|
||||
it a message for each. This is received with recvmsg() on the server
|
||||
socket. It has no data, and has a single dataless control message
|
||||
attached:
|
||||
|
||||
RXRPC_NEW_CALL
|
||||
|
||||
The address that can be passed back by recvmsg() at this point should be
|
||||
ignored since the call for which the message was posted may have gone by
|
||||
the time it is accepted - in which case the first call still on the queue
|
||||
will be accepted.
|
||||
|
||||
(6) The server then accepts the new call by issuing a sendmsg() with two
|
||||
pieces of control data and no actual data:
|
||||
|
||||
RXRPC_ACCEPT - indicate connection acceptance
|
||||
RXRPC_USER_CALL_ID - specify user ID for this call
|
||||
|
||||
(7) The first request data packet will then be posted to the server socket for
|
||||
recvmsg() to pick up. At that point, the RxRPC address for the call can
|
||||
be read from the address fields in the msghdr struct.
|
||||
|
||||
Subsequent request data will be posted to the server socket for recvmsg()
|
||||
to collect as it arrives. All but the last piece of the request data will
|
||||
be delivered with MSG_MORE flagged.
|
||||
|
||||
All data will be delivered with the following control message attached:
|
||||
|
||||
RXRPC_USER_CALL_ID - specifies the user ID for this call
|
||||
|
||||
(8) The reply data should then be posted to the server socket using a series
|
||||
of sendmsg() calls, each with the following control messages attached:
|
||||
|
||||
RXRPC_USER_CALL_ID - specifies the user ID for this call
|
||||
|
||||
MSG_MORE should be set in msghdr::msg_flags on all but the last message
|
||||
for a particular call.
|
||||
|
||||
(9) The final ACK from the client will be posted for retrieval by recvmsg()
|
||||
when it is received. It will take the form of a dataless message with two
|
||||
control messages attached:
|
||||
|
||||
RXRPC_USER_CALL_ID - specifies the user ID for this call
|
||||
RXRPC_ACK - indicates final ACK (no data)
|
||||
|
||||
MSG_EOR will be flagged to indicate that this is the final message for
|
||||
this call.
|
||||
|
||||
(10) Up to the point the final packet of reply data is sent, the call can be
|
||||
aborted by calling sendmsg() with a dataless message with the following
|
||||
control messages attached:
|
||||
|
||||
RXRPC_USER_CALL_ID - specifies the user ID for this call
|
||||
RXRPC_ABORT - indicates abort code (4 byte data)
|
||||
|
||||
Any packets waiting in the socket's receive queue will be discarded if
|
||||
this is issued.
|
||||
|
||||
Note that all the communications for a particular service take place through
|
||||
the one server socket, using control messages on sendmsg() and recvmsg() to
|
||||
determine the call affected.
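
The two dataless sendmsg() calls in the server procedure, accepting a call
and aborting one, differ only in their control data. The sketch below builds
both. The helper names are hypothetical and the numeric constants are
assumptions believed to match the kernel headers, which should be used in
preference.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed values; prefer the definitions in the kernel headers. */
#ifndef SOL_RXRPC
#define SOL_RXRPC		272
#endif
#define RXRPC_USER_CALL_ID	1
#define RXRPC_ABORT		2
#define RXRPC_ACCEPT		9

/* Room for a call ID plus one companion message of up to four bytes. */
union srv_ctl {
	char		buf[CMSG_SPACE(sizeof(unsigned long)) +
			    CMSG_SPACE(sizeof(uint32_t))];
	struct cmsghdr	align;	/* forces suitable alignment of buf */
};

/* Append one control message at offset *used and advance *used. */
static void put_ctl(union srv_ctl *ctl, size_t *used, int type,
		    const void *data, size_t len)
{
	struct cmsghdr *c = (struct cmsghdr *)(ctl->buf + *used);

	c->cmsg_level = SOL_RXRPC;
	c->cmsg_type = type;
	c->cmsg_len = CMSG_LEN(len);
	if (len)
		memcpy(CMSG_DATA(c), data, len);
	*used += CMSG_SPACE(len);
}

/* Dataless message: RXRPC_ACCEPT plus the user ID to assign. */
static void prepare_accept(struct msghdr *msg, union srv_ctl *ctl,
			   unsigned long call_id)
{
	size_t used = 0;

	memset(ctl, 0, sizeof(*ctl));
	memset(msg, 0, sizeof(*msg));
	put_ctl(ctl, &used, RXRPC_ACCEPT, NULL, 0);
	put_ctl(ctl, &used, RXRPC_USER_CALL_ID, &call_id, sizeof(call_id));
	msg->msg_control = ctl->buf;
	msg->msg_controllen = used;
}

/* Dataless message: the affected call plus a four-byte abort code. */
static void prepare_abort(struct msghdr *msg, union srv_ctl *ctl,
			  unsigned long call_id, uint32_t abort_code)
{
	size_t used = 0;

	memset(ctl, 0, sizeof(*ctl));
	memset(msg, 0, sizeof(*msg));
	put_ctl(ctl, &used, RXRPC_USER_CALL_ID, &call_id, sizeof(call_id));
	put_ctl(ctl, &used, RXRPC_ABORT, &abort_code, sizeof(abort_code));
	msg->msg_control = ctl->buf;
	msg->msg_controllen = used;
}
```

In either case the prepared msghdr would be handed to sendmsg(server, &msg,
0). As described under RXRPC_ACCEPT, the accept may fail with ENODATA or
EBADSLT, so the return value needs checking.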
|
||||
|
||||
|
||||
=========================
|
||||
AF_RXRPC KERNEL INTERFACE
|
||||
=========================
|
||||
|
||||
The AF_RXRPC module also provides an interface for use by in-kernel utilities
|
||||
such as the AFS filesystem. This permits such a utility to:
|
||||
|
||||
(1) Use different keys directly on individual client calls on one socket
|
||||
rather than having to open a whole slew of sockets, one for each key it
|
||||
might want to use.
|
||||
|
||||
(2) Avoid having RxRPC call request_key() at the point of issue of a call or
|
||||
opening of a socket. Instead the utility is responsible for requesting a
|
||||
key at the appropriate point. AFS, for instance, would do this during VFS
|
||||
operations such as open() or unlink(). The key is then handed through
|
||||
when the call is initiated.
|
||||
|
||||
(3) Request the use of something other than GFP_KERNEL to allocate memory.
|
||||
|
||||
(4) Avoid the overhead of using the recvmsg() call. RxRPC messages can be
|
||||
intercepted before they get put into the socket Rx queue and the socket
|
||||
buffers manipulated directly.
|
||||
|
||||
To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket,
|
||||
bind an address as appropriate and listen if it's to be a server socket, but
|
||||
then it passes this to the kernel interface functions.
|
||||
|
||||
The kernel interface functions are as follows:
|
||||
|
||||
(*) Begin a new client call.
|
||||
|
||||
struct rxrpc_call *
|
||||
rxrpc_kernel_begin_call(struct socket *sock,
|
||||
struct sockaddr_rxrpc *srx,
|
||||
struct key *key,
|
||||
unsigned long user_call_ID,
|
||||
gfp_t gfp);
|
||||
|
||||
This allocates the infrastructure to make a new RxRPC call and assigns
|
||||
call and connection numbers. The call will be made on the UDP port that
|
||||
the socket is bound to. The call will go to the destination address of a
|
||||
connected client socket unless an alternative is supplied (srx is
|
||||
non-NULL).
|
||||
|
||||
If a key is supplied then this will be used to secure the call instead of
|
||||
the key bound to the socket with the RXRPC_SECURITY_KEY sockopt. Calls
|
||||
secured in this way will still share connections if at all possible.
|
||||
|
||||
The user_call_ID is equivalent to that supplied to sendmsg() in the
|
||||
control data buffer. It is entirely feasible to use this to point to a
|
||||
kernel data structure.
|
||||
|
||||
If this function is successful, an opaque reference to the RxRPC call is
|
||||
returned. The caller now holds a reference on this and it must be
|
||||
properly ended.
|
||||
|
||||
(*) End a client call.
|
||||
|
||||
void rxrpc_kernel_end_call(struct rxrpc_call *call);
|
||||
|
||||
This is used to end a previously begun call. The user_call_ID is expunged
|
||||
from AF_RXRPC's knowledge and will not be seen again in association with
|
||||
the specified call.
|
||||
|
||||
(*) Send data through a call.
|
||||
|
||||
int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg,
|
||||
size_t len);
|
||||
|
||||
This is used to supply either the request part of a client call or the
|
||||
reply part of a server call. msg.msg_iovlen and msg.msg_iov specify the
|
||||
data buffers to be used. msg_iov may not be NULL and must point
|
||||
exclusively to in-kernel virtual addresses. msg.msg_flags may be given
|
||||
MSG_MORE if there will be subsequent data sends for this call.
|
||||
|
||||
The msg must not specify a destination address, control data or any flags
|
||||
other than MSG_MORE. len is the total amount of data to transmit.
|
||||
|
||||
(*) Abort a call.
|
||||
|
||||
void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code);
|
||||
|
||||
This is used to abort a call if it's still in an abortable state. The
|
||||
abort code specified will be placed in the ABORT message sent.
|
||||
|
||||
(*) Intercept received RxRPC messages.
|
||||
|
||||
typedef void (*rxrpc_interceptor_t)(struct sock *sk,
|
||||
unsigned long user_call_ID,
|
||||
struct sk_buff *skb);
|
||||
|
||||
void
|
||||
rxrpc_kernel_intercept_rx_messages(struct socket *sock,
|
||||
rxrpc_interceptor_t interceptor);
|
||||
|
||||
This installs an interceptor function on the specified AF_RXRPC socket.
|
||||
All messages that would otherwise wind up in the socket's Rx queue are
|
||||
then diverted to this function. Note that care must be taken to process
|
||||
the messages in the right order to maintain DATA message sequentiality.
|
||||
|
||||
The interceptor function itself is provided with the address of the socket
|
||||
handling the incoming message, the ID assigned by the kernel utility
|
||||
to the call and the socket buffer containing the message.
|
||||
|
||||
The skb->mark field indicates the type of message:
|
||||
|
||||
MARK                            MEANING
=============================== =======================================
RXRPC_SKB_MARK_DATA             Data message
RXRPC_SKB_MARK_FINAL_ACK        Final ACK received for an incoming call
RXRPC_SKB_MARK_BUSY             Client call rejected as server busy
RXRPC_SKB_MARK_REMOTE_ABORT     Call aborted by peer
RXRPC_SKB_MARK_NET_ERROR        Network error detected
RXRPC_SKB_MARK_LOCAL_ERROR      Local error encountered
RXRPC_SKB_MARK_NEW_CALL         New incoming call awaiting acceptance
|
||||
|
||||
The remote abort message can be probed with rxrpc_kernel_get_abort_code().
|
||||
The two error messages can be probed with rxrpc_kernel_get_error_number().
|
||||
A new call can be accepted with rxrpc_kernel_accept_call().
|
||||
|
||||
Data messages can have their contents extracted with the usual bunch of
|
||||
socket buffer manipulation functions. A data message can be determined to
|
||||
be the last one in a sequence with rxrpc_kernel_is_data_last(). When a
|
||||
data message has been used up, rxrpc_kernel_data_delivered() should be
|
||||
called on it.
|
||||
|
||||
Non-data messages should be handed to rxrpc_kernel_free_skb() to dispose
|
||||
of. It is possible to get extra refs on all types of message for later
|
||||
freeing, but this may pin the state of a call until the message is finally
|
||||
freed.
|
||||
|
||||
(*) Accept an incoming call.
|
||||
|
||||
struct rxrpc_call *
|
||||
rxrpc_kernel_accept_call(struct socket *sock,
|
||||
unsigned long user_call_ID);
|
||||
|
||||
This is used to accept an incoming call and to assign it a call ID. This
|
||||
function is similar to rxrpc_kernel_begin_call() and calls accepted must
|
||||
be ended in the same way.
|
||||
|
||||
If this function is successful, an opaque reference to the RxRPC call is
|
||||
returned. The caller now holds a reference on this and it must be
|
||||
properly ended.
|
||||
|
||||
(*) Reject an incoming call.
|
||||
|
||||
int rxrpc_kernel_reject_call(struct socket *sock);
|
||||
|
||||
This is used to reject the first incoming call on the socket's queue with
|
||||
a BUSY message. -ENODATA is returned if there were no incoming calls.
|
||||
Other errors may be returned if the call had been aborted (-ECONNABORTED)
|
||||
or had timed out (-ETIME).
|
||||
|
||||
 (*) Record the delivery of a data message and free it.

        void rxrpc_kernel_data_delivered(struct sk_buff *skb);

     This is used to record a data message as having been delivered and to
     update the ACK state for the call. The socket buffer will be freed.

 (*) Free a message.

        void rxrpc_kernel_free_skb(struct sk_buff *skb);

     This is used to free a non-DATA socket buffer intercepted from an AF_RXRPC
     socket.

 (*) Determine if a data message is the last one on a call.

        bool rxrpc_kernel_is_data_last(struct sk_buff *skb);

     This is used to determine if a socket buffer holds the last data message
     to be received for a call (true will be returned if it does, false
     if not).

     The data message will be part of the reply on a client call and the
     request on an incoming call. In the latter case there will be more
     messages, but in the former case there will not.

 (*) Get the abort code from an abort message.

        u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb);

     This is used to extract the abort code from a remote abort message.

 (*) Get the error number from a local or network error message.

        int rxrpc_kernel_get_error_number(struct sk_buff *skb);

     This is used to extract the error number from a message indicating either
     a local error occurred or a network error occurred.

 (*) Allocate a null key for doing anonymous security.

        struct key *rxrpc_get_null_key(const char *keyname);

     This is used to allocate a null RxRPC key that can be used to indicate
     anonymous security for a particular domain.


=======================
CONFIGURABLE PARAMETERS
=======================

The RxRPC protocol driver has a number of configurable parameters that can be
adjusted through sysctls in /proc/sys/net/rxrpc/:

 (*) req_ack_delay

     The amount of time in milliseconds after receiving a packet with the
     request-ack flag set before we honour the flag and actually send the
     requested ack.

     Usually the other side won't stop sending packets until the advertised
     reception window is full (to a maximum of 255 packets), so delaying the
     ACK permits several packets to be ACK'd in one go.

 (*) soft_ack_delay

     The amount of time in milliseconds after receiving a new packet before we
     generate a soft-ACK to tell the sender that it doesn't need to resend.

 (*) idle_ack_delay

     The amount of time in milliseconds after all the packets currently in the
     received queue have been consumed before we generate a hard-ACK to tell
     the sender it can free its buffers, assuming no other reason occurs that
     we would send an ACK.

 (*) resend_timeout

     The amount of time in milliseconds after transmitting a packet before we
     transmit it again, assuming no ACK is received from the receiver telling
     us they got it.

 (*) max_call_lifetime

     The maximum amount of time in seconds that a call may be in progress
     before we preemptively kill it.

 (*) dead_call_expiry

     The amount of time in seconds before we remove a dead call from the call
     list. Dead calls are kept around for a little while for the purpose of
     repeating ACK and ABORT packets.

 (*) connection_expiry

     The amount of time in seconds after a connection was last used before we
     remove it from the connection list. Whilst a connection is in existence,
     it serves as a placeholder for negotiated security; when it is deleted,
     the security must be renegotiated.

 (*) transport_expiry

     The amount of time in seconds after a transport was last used before we
     remove it from the transport list. Whilst a transport is in existence, it
     serves to anchor the peer data and keeps the connection ID counter.

 (*) rxrpc_rx_window_size

     The size of the receive window in packets. This is the maximum number of
     unconsumed received packets we're willing to hold in memory for any
     particular call.

 (*) rxrpc_rx_mtu

     The maximum packet MTU size that we're willing to receive in bytes. This
     indicates to the peer whether we're willing to accept jumbo packets.

 (*) rxrpc_rx_jumbo_max

     The maximum number of packets that we're willing to accept in a jumbo
     packet. Non-terminal packets in a jumbo packet must contain a four byte
     header plus exactly 1412 bytes of data. The terminal packet must contain
     a four byte header plus any amount of data. In any event, a jumbo packet
     may not exceed rxrpc_rx_mtu in size.
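
The size rules given for rxrpc_rx_jumbo_max can be worked through with a
small userspace sketch. The function names jumbo_size() and
jumbo_acceptable() are hypothetical and exist only for this illustration;
they are not taken from the RxRPC driver. The sketch counts only the
per-packet four byte headers and data described above and assumes that any
outer headers are accounted for separately, since the text does not say how
they count against rxrpc_rx_mtu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define JUMBO_SUBPKT_HDR   4     /* four byte header on every packet */
#define JUMBO_SUBPKT_DATA  1412  /* exact data size of a non-terminal packet */

/*
 * Size in bytes of a jumbo packet made up of nr_subpkts packets, the last
 * of which carries last_len bytes of data.  Outer headers are not counted
 * (an assumption of this sketch).
 */
static size_t jumbo_size(unsigned int nr_subpkts, size_t last_len)
{
	assert(nr_subpkts >= 1);
	return (size_t)(nr_subpkts - 1) * (JUMBO_SUBPKT_HDR + JUMBO_SUBPKT_DATA)
		+ JUMBO_SUBPKT_HDR + last_len;
}

/* True if the jumbo packet respects both limits described in the text. */
static bool jumbo_acceptable(unsigned int nr_subpkts, size_t last_len,
			     unsigned int rx_jumbo_max, size_t rx_mtu)
{
	if (nr_subpkts < 1 || nr_subpkts > rx_jumbo_max)
		return false;
	return jumbo_size(nr_subpkts, last_len) <= rx_mtu;
}
```

For example, three packets with an empty terminal packet occupy
2 * (4 + 1412) + 4 = 2836 bytes. Whether that is acceptable depends on the
values configured for rxrpc_rx_jumbo_max and rxrpc_rx_mtu, which are passed
in as arguments here rather than assumed.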
141
Documentation/networking/s2io.txt
Normal file
@ -0,0 +1,141 @@
Release notes for Neterion's (Formerly S2io) Xframe I/II PCI-X 10GbE driver.

Contents
========
- 1. Introduction
- 2. Identifying the adapter/interface
- 3. Features supported
- 4. Command line parameters
- 5. Performance suggestions
- 6. Support


1. Introduction:
This Linux driver supports Neterion's Xframe I PCI-X 1.0 and
Xframe II PCI-X 2.0 adapters. It supports several features
such as jumbo frames, MSI/MSI-X, checksum offloads, TSO, UFO and so on.
See below for the complete list of features.
All features are supported for both IPv4 and IPv6.

2. Identifying the adapter/interface:
a. Insert the adapter(s) in your system.
b. Build and load driver
# insmod s2io.ko
c. View log messages
# dmesg | tail -40
You will see messages similar to:
eth3: Neterion Xframe I 10GbE adapter (rev 3), Version 2.0.9.1, Intr type INTA
eth4: Neterion Xframe II 10GbE adapter (rev 2), Version 2.0.9.1, Intr type INTA
eth4: Device is on 64 bit 133MHz PCIX(M1) bus

The above messages identify the adapter type (Xframe I/II), adapter revision,
driver version, interface name (eth3, eth4) and interrupt type (INTA, MSI,
MSI-X). In case of Xframe II, the PCI/PCI-X bus width and frequency are
displayed as well.

To associate an interface with a physical adapter use "ethtool -p <ethX>".
The corresponding adapter's LED will blink multiple times.

3. Features supported:
a. Jumbo frames. Xframe I/II supports MTU up to 9600 bytes,
modifiable using ifconfig command.

b. Offloads. Supports checksum offload (TCP/UDP/IP) on transmit
and receive, and TSO.

c. Multi-buffer receive mode. Scattering of packet across multiple
buffers. Currently the driver supports 2-buffer mode, which yields a
significant performance improvement on certain platforms (SGI Altix,
IBM xSeries).

d. MSI/MSI-X. Can be enabled on platforms which support this feature
(IA64, Xeon), resulting in a noticeable performance improvement (up to 7%
on certain platforms).

e. Statistics. Comprehensive MAC-level and software statistics displayed
using "ethtool -S" option.

f. Multi-FIFO/Ring. Supports up to 8 transmit queues and receive rings,
with multiple steering options.

4. Command line parameters
a. tx_fifo_num
Number of transmit queues
Valid range: 1-8
Default: 1

b. rx_ring_num
Number of receive rings
Valid range: 1-8
Default: 1

c. tx_fifo_len
Size of each transmit queue
Valid range: Total length of all queues should not exceed 8192
Default: 4096

d. rx_ring_sz
Size of each receive ring (in 4K blocks)
Valid range: Limited by memory on system
Default: 30

e. intr_type
Specifies interrupt type. Possible values 0 (INTA), 2 (MSI-X)
Valid values: 0, 2
Default: 2

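The limits on tx_fifo_num and tx_fifo_len interact: the more transmit queues
are configured, the shorter each one has to be. The following userspace
sketch checks a candidate pair of values against the ranges listed above. The
function name s2io_tx_params_ok() and the macro names are hypothetical and are
not taken from the driver source; the sketch also assumes that every queue is
given the same length, so that the total is simply tx_fifo_num * tx_fifo_len.

```c
#include <stdbool.h>

#define S2IO_MAX_TX_FIFOS     8     /* "Valid range: 1-8" for tx_fifo_num */
#define S2IO_MAX_TX_FIFO_SUM  8192  /* limit on the total length of all queues */

/* True if the pair of values satisfies the documented ranges. */
static bool s2io_tx_params_ok(unsigned int tx_fifo_num,
			      unsigned int tx_fifo_len)
{
	if (tx_fifo_num < 1 || tx_fifo_num > S2IO_MAX_TX_FIFOS)
		return false;
	if (tx_fifo_len < 1)
		return false;
	/* Equal-length queues assumed: num * len must not exceed the limit. */
	return tx_fifo_len <= S2IO_MAX_TX_FIFO_SUM / tx_fifo_num;
}
```

With the default tx_fifo_len of 4096, one or two queues fit within the limit,
but three do not; with eight queues each queue may be at most 1024 entries
long.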
5. Performance suggestions
General:
a. Set MTU to maximum (9000 for switch setup, 9600 in back-to-back
configuration).
b. Set the TCP window size to an optimal value.
For instance, for MTU=1500 a value of 210K has been observed to result in
good performance.
# sysctl -w net.ipv4.tcp_rmem="210000 210000 210000"
# sysctl -w net.ipv4.tcp_wmem="210000 210000 210000"
For MTU=9000, a TCP window size of 10 MB is recommended.
# sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"
# sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"

Transmit performance:
a. By default, the driver respects BIOS settings for PCI bus parameters.
However, you may want to experiment with PCI bus parameters
max-split-transactions (MOST) and MMRBC (use setpci command).
A MOST value of 2 has been found optimal for Opterons and 3 for Itanium.
It could be different for your hardware.
Set MMRBC to 4K**.

For example, you can set:
For Opteron
# setpci -d 17d5:* 62=1d
For Itanium
# setpci -d 17d5:* 62=3d

For a detailed description of the PCI registers, please see the Xframe User
Guide.

b. Ensure Transmit Checksum offload is enabled. Use ethtool to set/verify this
parameter.
c. Turn on TSO (using "ethtool -K")
# ethtool -K <ethX> tso on

Receive performance:
a. By default, the driver respects BIOS settings for PCI bus parameters.
However, you may want to set PCI latency timer to 248.
# setpci -d 17d5:* LATENCY_TIMER=f8
For a detailed description of the PCI registers, please see the Xframe User
Guide.
b. Use 2-buffer mode. This results in a large performance boost on
certain platforms (e.g. SGI Altix, IBM xSeries).
c. Ensure Receive Checksum offload is enabled. Use "ethtool -K ethX" command to
set/verify this option.
d. Enable NAPI feature (in kernel configuration Device Drivers ---> Network
device support ---> Ethernet (10000 Mbit) ---> S2IO 10Gbe Xframe NIC) to
bring down CPU utilization.

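The latency timer value of 248 in receive step a. appears as f8 on the setpci
command line because setpci takes its values in hexadecimal. The following
userspace sketch builds that command from a decimal value. The function name
format_latency_cmd() is hypothetical and used only for this illustration; the
device selector 17d5:* is the one used in the commands above.

```c
#include <stdio.h>

/*
 * Write the setpci command that sets the PCI latency timer to 'latency'
 * into buf.  Returns the number of characters that the command needs, or
 * -1 if the value does not fit the one-byte latency timer register.
 */
static int format_latency_cmd(char *buf, size_t len, unsigned int latency)
{
	if (latency > 0xff)
		return -1;
	/* %x prints the value in hexadecimal, as setpci expects. */
	return snprintf(buf, len, "setpci -d 17d5:* LATENCY_TIMER=%x", latency);
}
```

Passing 248 produces the same command as shown in step a., which makes it
easy to derive the command for other values when experimenting.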
** For AMD Opteron platforms with 8131 chipset, MMRBC=1 and MOST=1 are
recommended as safe parameters.
For more information, please review the AMD8131 errata at
http://vip.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26310_AMD-8131_HyperTransport_PCI-X_Tunnel_Revision_Guide_rev_3_18.pdf

6. Support
For further support, please contact your 10GbE Xframe NIC vendor (IBM,
HP, SGI, etc.).