Commit graph

27150 commits

Author SHA1 Message Date
Vlad Yasevich
bc9a25d21e bridge: Add vlan support for local fdb entries
When VLAN is added to the port, a local fdb entry for that port
(the entry with the mac address of the port) is added for that
VLAN.  This way we can correctly determine if the traffic
is for the bridge itself.  If the address of the port changes,
we try to change all the local fdb entries we have for that port.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 19:42:16 -05:00
Vlad Yasevich
1690be63a2 bridge: Add vlan support to static neighbors
When a user adds bridge neighbors, allow him to specify VLAN id.
If the VLAN id is not specified, the neighbor will be added
for VLANs currently in the ports filter list.  If no VLANs are
configured on the port, we use vlan 0 and only add 1 entry.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 19:42:16 -05:00
Vlad Yasevich
b0e9a30dd6 bridge: Add vlan id to multicast groups
Add vlan_id to multicasts groups so that we know which vlan
each group belongs to and can correctly forward to appropriate vlan.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 19:42:16 -05:00
Vlad Yasevich
2ba071ecb6 bridge: Add vlan to unicast fdb entries
This patch adds vlan to unicast fdb entries that are created for
learned addresses (not the manually configured ones).  It adds
vlan id into the hash mix and uses vlan as an addditional parameter
for an entry match.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 19:42:15 -05:00
Vlad Yasevich
552406c488 bridge: Add the ability to configure pvid
A user may designate a certain vlan as PVID.  This means that
any ingress frame that does not contain a vlan tag is assigned to
this vlan and any forwarding decisions are made with this vlan in mind.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 19:42:15 -05:00
Vlad Yasevich
7885198861 bridge: Implement vlan ingress/egress policy with PVID.
At ingress, any untagged traffic is assigned to the PVID.
Any tagged traffic is filtered according to membership bitmap.

At egress, if the vlan matches the PVID, the frame is sent
untagged.  Otherwise the frame is sent tagged.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 19:42:15 -05:00
Vlad Yasevich
6cbdceeb1c bridge: Dump vlan information from a bridge port
Using the RTM_GETLINK dump the vlan filter list of a given
bridge port.  The information depends on setting the filter
flag similar to how nic VF info is dumped.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 19:41:46 -05:00
Vlad Yasevich
407af3299e bridge: Add netlink interface to configure vlans on bridge ports
Add a netlink interface to add and remove vlan configuration on bridge port.
The interface uses the RTM_SETLINK message and encodes the vlan
configuration inside the IFLA_AF_SPEC.  It is possble to include multiple
vlans to either add or remove in a single message.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 19:41:46 -05:00
Vlad Yasevich
85f46c6bae bridge: Verify that a vlan is allowed to egress on given port
When bridge forwards a frame, make sure that a frame is allowed
to egress on that port.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 19:41:46 -05:00
Vlad Yasevich
a37b85c9fb bridge: Validate that vlan is permitted on ingress
When a frame arrives on a port or transmitted by the bridge,
if we have VLANs configured, validate that a given VLAN is allowed
to enter the bridge.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 19:41:46 -05:00
Vlad Yasevich
243a2e63f5 bridge: Add vlan filtering infrastructure
Adds an optional infrustructure component to bridge that would allow
native vlan filtering in the bridge.  Each bridge port (as well
as the bridge device) now get a VLAN bitmap.  Each bit in the bitmap
is associated with a vlan id.  This way if the bit corresponding to
the vid is set in the bitmap that the packet with vid is allowed to
enter and exit the port.

Write access the bitmap is protected by RTNL and read access
protected by RCU.

Vlan functionality is disabled by default.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 19:41:46 -05:00
Michal Kubeček
894e2ac82b netfilter: nf_ct_reasm: fix per-netns sysctl initialization
Adjusting of data pointers in net/netfilter/nf_conntrack_frag6_*
sysctl table for other namespaces points to wrong netns_frags
structure and has reversed order of entries.

Problem introduced by commit c038a767cd in 3.7-rc1

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-02-13 20:46:06 +01:00
Daniel Borkmann
2222299748 net: sctp: add build check for sctp_sf_eat_sack_6_2/jsctp_sf_eat_sack
In order to avoid any future surprises of kernel panics due to jprobes
function mismatches (as e.g. fixed in 4cb9d6eaf8: sctp: jsctp_sf_eat_sack:
fix jprobes function signature mismatch), we should check both function
types during build and scream loudly if they do not match. __same_type
resolves to __builtin_types_compatible_p, which is 1 in case both types
are the same and 0 otherwise, qualifiers are ignored. Tested by myself.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 14:07:14 -05:00
Daniel Borkmann
1e55817463 net: sctp: minor: make jsctp_sf_eat_sack static
The function jsctp_sf_eat_sack can be made static, no need to extend
its visibility.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 14:07:14 -05:00
Kees Cook
3bdb1a443a net, sctp: remove CONFIG_EXPERIMENTAL
This config item has not carried much meaning for a while now and is
almost always enabled by default. As agreed during the Linux kernel
summit, remove it.

Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 13:57:27 -05:00
Daniel Borkmann
e9c0dfbaa2 net: sctp: sctp_v6_get_dst: fix boolean test in dst cache
We walk through the bind address list and try to get the best source
address for a given destination. However, currently, we take the
'continue' path of the loop when an entry is invalid (!laddr->valid)
*and* the entry state does not equal SCTP_ADDR_SRC (laddr->state !=
SCTP_ADDR_SRC).

Thus, still, invalid entries with SCTP_ADDR_SRC might not 'continue'
as well as valid entries with SCTP_ADDR_{NEW, SRC, DEL}, with a possible
false baddr and matchlen as a result, causing in worst case dst route
to be false or possibly NULL.

This test should actually be a '||' instead of '&&'. But lets fix it
and make this a bit easier to read by having the condition the same way
as similarly done in sctp_v4_get_dst.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 13:42:34 -05:00
Pau Koning
816cd5b83e batman-adv: Fix NULL pointer dereference in DAT hash collision avoidance
An entry in DAT with the hashed position of 0 can cause a NULL pointer
dereference when the first entry is checked by batadv_choose_next_candidate.
This first candidate automatically has the max value of 0 and the max_orig_node
of NULL. Not checking max_orig_node for NULL in batadv_is_orig_node_eligible
will lead to a NULL pointer dereference when checking for the lowest address.

This problem was added in 785ea11441
("batman-adv: Distributed ARP Table - create DHT helper functions").

Signed-off-by: Pau Koning <paukoning@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 13:35:24 -05:00
Pravin B Shelar
c9af6db4c1 net: Fix possible wrong checksum generation.
Patch cef401de7b (net: fix possible wrong checksum
generation) fixed wrong checksum calculation but it broke TSO by
defining new GSO type but not a netdev feature for that type.
net_gso_ok() would not allow hardware checksum/segmentation
offload of such packets without the feature.

Following patch fixes TSO and wrong checksum. This patch uses
same logic that Eric Dumazet used. Patch introduces new flag
SKBTX_SHARED_FRAG if at least one frag can be modified by
the user. but SKBTX_SHARED_FRAG flag is kept in skb shared
info tx_flags rather than gso_type.

tx_flags is better compared to gso_type since we can have skb with
shared frag without gso packet. It does not link SHARED_FRAG to
GSO, So there is no need to define netdev feature for this.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 13:30:10 -05:00
Andrey Vagin
ee684b6f28 tcp: send packets with a socket timestamp
A socket timestamp is a sum of the global tcp_time_stamp and
a per-socket offset.

A socket offset is added in places where externally visible
tcp timestamp option is parsed/initialized.

Connections in the SYN_RECV state are not supported, global
tcp_time_stamp is used for them, because repair mode doesn't support
this state. In a future it can be implemented by the similar way
as for TIME_WAIT sockets.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 13:22:16 -05:00
Andrey Vagin
93be6ce0e9 tcp: set and get per-socket timestamp
A timestamp can be set, only if a socket is in the repair mode.

This patch adds a new socket option TCP_TIMESTAMP, which allows to
get and set current tcp times stamp.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 13:22:15 -05:00
Andrey Vagin
ceaa1fef65 tcp: adding a per-socket timestamp offset
This functionality is used for restoring tcp sockets. A tcp timestamp
depends on how long a system has been running, so it's differ for each
host. The solution is to set a per-socket offset.

A per-socket offset for a TIME_WAIT socket is inherited from a proper
tcp socket.

tcp_request_sock doesn't have a timestamp offset, because the repair
mode for them are not implemented.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 13:22:15 -05:00
Neil Horman
959d5fdee7 netpoll: fix smatch warnings in netpoll core code
Dan Carpenter contacted me with some notes regarding some smatch warnings in the
netpoll code, some of which I introduced with my recent netpoll locking fixes,
some which were there prior.   Specifically they were:

net-next/net/core/netpoll.c:243 netpoll_poll_dev() warn: inconsistent
  returns mutex:&ni->dev_lock: locked (213,217) unlocked (210,243)
net-next/net/core/netpoll.c:706 netpoll_neigh_reply() warn: potential
  pointer math issue ('skb_transport_header(send_skb)' is a 128 bit pointer)

This patch corrects the locking imbalance (the first error), and adds some
parenthesis to correct the second error.  Tested by myself. Applies to net-next

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Dan Carpenter <dan.carpenter@oracle.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 11:56:46 -05:00
James Hogan
99d5851eef net: skbuff: fix compile error in skb_panic()
I get the following build error on next-20130213 due to the following
commit:

commit f05de73bf8 ("skbuff: create
skb_panic() function and its wrappers").

It adds an argument called panic to a function that uses the BUG() macro
which tries to call panic, but the argument masks the panic() function
declaration, resulting in the following error (gcc 4.2.4):

net/core/skbuff.c In function 'skb_panic':
net/core/skbuff.c +126 : error: called object 'panic' is not a function

This is fixed by renaming the argument to msg.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Jean Sacren <sakiwit@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13 11:52:04 -05:00
Eric W. Biederman
f025adf191 sunrpc: Properly decode kuids and kgids in RPC_AUTH_UNIX credentials
When reading kuids from the wire map them into the initial user
namespace, and validate the mapping succeded.

When reading kgids from the wire map them into the initial user
namespace, and validate the mapping succeded.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:26 -08:00
Eric W. Biederman
25da926371 sunrpc: Properly encode kuids and kgids in auth.unix.gid rpc pipe upcalls.
When a new rpc connection is established with an in-kernel server, the
traffic passes through svc_process_common, and svc_set_client and down
into svcauth_unix_set_client if it is of type RPC_AUTH_NULL or
RPC_AUTH_UNIX.

svcauth_unix_set_client then looks at the uid of the credential we
have assigned to the incomming client and if we don't have the groups
already cached makes an upcall to get a list of groups that the client
can use.

The upcall encodes send a rpc message to user space encoding the uid
of the user whose groups we want to know.  Encode the kuid of the user
in the initial user namespace as nfs mounts can only happen today in
the initial user namespace.

When a reply to an upcall comes in convert interpret the uid and gid values
from the rpc pipe as uids and gids in the initial user namespace and convert
them into kuids and kgids before processing them further.

When reading proc files listing the uid to gid list cache convert the
kuids and kgids from into uids and gids the initial user namespace.  As we are
displaying server internal details it makes sense to display these values
from the servers perspective.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:25 -08:00
Eric W. Biederman
a570abbb96 sunrpc: Properly encode kuids and kgids in RPC_AUTH_UNIX credentials
When writing kuids onto the wire first map them into the initial user
namespace.

When writing kgids onto the wire first map them into the initial user
namespace.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:24 -08:00
Eric W. Biederman
9e469e30d7 sunrpc: Hash uids by first computing their value in the initial userns
In svcauth_unix introduce a helper unix_gid_hash as otherwise the
expresion to generate the hash value is just too long.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:23 -08:00
Eric W. Biederman
683428fae8 sunrpc: Update svcgss xdr handle to rpsec_contect cache
For each received uid call make_kuid and validate the result.
For each received gid call make_kgid and validate the result.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:22 -08:00
Eric W. Biederman
90602c7b19 sunrpc: Update gss uid to security context mapping.
- Use from_kuid when generating the on the wire uid values.
- Use make_kuid when reading on the wire values.

In gss_encode_v0_msg, since the uid in gss_upcall_msg is now a kuid_t
generate the necessary uid_t value on the stack copy it into
gss_msg->databuf where it can safely live until the message is no
longer needed.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:21 -08:00
Eric W. Biederman
e572fc7398 sunrpc: Use gid_valid to test for gid != INVALID_GID
In auth unix there are a couple of places INVALID_GID is used a
sentinel to mark the end of uc_gids array.  Use gid_valid
as a type safe way to verify we have not hit the end of
valid data in the array.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:20 -08:00
Eric W. Biederman
cdba321e29 sunrpc: Convert kuids and kgids to uids and gids for printing
When printing kuids and kgids for debugging purpropses convert them
to ordinary integers so their values can be fed to the oridnary
print functions.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:19 -08:00
Eric W. Biederman
9132adb021 sunrpc: Simplify auth_unix now that everything is a kgid_t
In unx_create_cred directly assign gids from acred->group_info
to cred->uc_gids.

In unx_match directly compare uc_gids with group_info.

Now that both group_info and unx_cred gids are stored as kgids
this is valid and the extra layer of translation can be removed.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:18 -08:00
Eric W. Biederman
0b4d51b02a sunrpc: Use uid_eq and gid_eq where appropriate
When comparing uids use uid_eq instead of ==.
When comparing gids use gid_eq instead of ==.

And unfortunate cost of type safety.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:17 -08:00
Eric W. Biederman
7eaf040b72 sunrpc: Use kuid_t and kgid_t where appropriate
Convert variables that store uids and gids to be of type
kuid_t and kgid_t instead of type uid_t and gid_t.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:16 -08:00
Eric W. Biederman
bf37f79437 sunrpc: Use userns friendly constants.
Instead of (uid_t)0 use GLOBAL_ROOT_UID.
Instead of (gid_t)0 use GLOBAL_ROOT_GID.
Instead of (uid_t)-1 use INVALID_UID
Instead of (gid_t)-1 use INVALID_GID.
Instead of NOGROUP use INVALID_GID.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:15 -08:00
Johannes Berg
2a0e047ed6 cfg80211: configuration for WoWLAN over TCP
Intel Wireless devices are able to make a TCP connection
after suspending, sending some data and waking up when
the connection receives wakeup data (or breaks). Add the
WoWLAN configuration and feature advertising API for it.

Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-02-13 14:33:42 +01:00
Felix Fietkau
a0497f9f57 mac80211/minstrel_ht: add support for using CCK rates
When MCS rates start to get bad in 2.4 GHz because of long range or
strong interference, CCK rates can be a lot more robust.

This patch adds a pseudo MCS group containing CCK rates (long preamble
in the lower 4 slots, short preamble in the upper slots).

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
[make minstrel_ht_get_stats static]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-02-13 10:56:33 +01:00
Luciano Coelho
6719429dd6 cfg80211: check vendor IE length to avoid overrun
cfg80211_find_vendor_ie() was checking only that the vendor IE would
fit in the remaining IEs buffer.  If a corrupt includes a vendor IE
that is too small, we could potentially overrun the IEs buffer.

Fix this by checking that the vendor IE fits in the reported IE length
field and skip it otherwise.

Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Luciano Coelho <coelho@ti.com>
[change BUILD_BUG_ON to != 1 (from >= 2)]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-02-13 10:14:17 +01:00
Amitkumar Karwar
bb92d19983 nl80211: add packet offset information for wowlan pattern
If user knows the location of a wowlan pattern to be matched in
Rx packet, he can provide an offset with the pattern. This will
help drivers to ignore initial bytes and match the pattern
efficiently.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
[refactor pattern sending]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-02-13 10:09:48 +01:00
Jiri Pirko
c6d14ff11b act_police: improved accuracy at high rates
Current act_police uses rate table computed by the "tc" userspace
program, which has the following issue:

The rate table has 256 entries to map packet lengths to token (time
units).  With TSO sized packets, the 256 entry granularity leads to
loss/gain of rate, making the token bucket inaccurate.

Thus, instead of relying on rate table, this patch explicitly computes
the time and accounts for packet transmission times with nanosecond
granularity.

This is a followup to 56b765b79e
("htb: improved accuracy at high rates").

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-12 18:59:45 -05:00
Jiri Pirko
0e243218ba act_police: move struct tcf_police to act_police.c
It's not used anywhere else, so move it.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-12 18:59:45 -05:00
Jiri Pirko
b757c9336d tbf: improved accuracy at high rates
Current TBF uses rate table computed by the "tc" userspace program,
which has the following issue:

The rate table has 256 entries to map packet lengths to
token (time units).  With TSO sized packets, the 256 entry granularity
leads to loss/gain of rate, making the token bucket inaccurate.

Thus, instead of relying on rate table, this patch explicitly computes
the time and accounts for packet transmission times with nanosecond
granularity.

This is a followup to 56b765b79e
("htb: improved accuracy at high rates").

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-12 18:59:45 -05:00
Jiri Pirko
34c5d292ce sch_api: introduce qdisc_watchdog_schedule_ns()
tbf will need to schedule watchdog in ns. No need to convert it twice.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-12 18:59:45 -05:00
Jiri Pirko
292f1c7ff6 sch: make htb_rate_cfg and functions around that generic
As it is going to be used in tbf as well, push these to generic code.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-12 18:59:45 -05:00
Jiri Pirko
b9a7afdefd htb: initialize cl->tokens and cl->ctokens correctly
These are in ns so convert from ticks to ns.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-12 18:59:44 -05:00
Jiri Pirko
bdd6998b1e htb: remove pointless first initialization of buffer and cbuffer
These are initialized correctly a couple of lines later in the
function.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-12 18:59:44 -05:00
Jiri Pirko
324f5aa528 htb: use PSCHED_TICKS2NS()
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-12 18:59:44 -05:00
David S. Miller
9f6d98c298 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c

The bnx2x gso_type setting bug fix in 'net' conflicted with
changes in 'net-next' that broke the gso_* setting logic
out into a seperate function, which also fixes the bug in
question.  Thus, use the 'net-next' version.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-12 18:58:28 -05:00
Jiri Pirko
9c10f4115c htb: fix values in opt dump
in htb_change_class() cl->buffer and cl->buffer are stored in ns.
So in dump, convert them back to psched ticks.

Note this was introduced by:
commit 56b765b79e
    htb: improved accuracy at high rates

Please consider this for -net/-stable.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-12 18:51:11 -05:00
Florian Westphal
6e2f0aa8cf netfilter: nf_ct_helper: don't discard helper if it is actually the same
commit (32f5376 netfilter: nf_ct_helper: disable automatic helper
re-assignment of different type) broke transparent proxy scenarios.

For example, initial helper lookup might yield "ftp" (dport 21),
while re-lookup after REDIRECT yields "ftp-2121".

This causes the autoassign code to toss the ftp helper, even
though these are just different instances of the same helper.

Change the test to check for the helper function address instead
of the helper address, as suggested by Pablo.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@gnumonks.org>
2013-02-12 23:20:46 +01:00