Linux 网络协议栈开发代码分析篇之VLAN(一)—— vlan 功能模块分析
整理自:http://blog.csdn.net/lickylin/article/details/41832967
本文代码基于linux3.14
Vlan即虚拟局域网,一个vlan能够模拟一个常规的交换网络,实现了将一个物理的交换机划分成多个逻辑的交换网络。而不同的vlan之间如果要进行通信就要通过三层协议来实现。
在linux中vlan的配置使用vconfig,使用vconfig配置一个交换网络,大致的流程如下:
#
# vconfig add eth0 100
#
# ifconfig eth0.100 up
# brctl addbr br1
# ifconfig br1 up
# brctl addif br1 eth0.100
# brctl addif br1 eth2
这样就实现了将从eth0 口进来的vlan id 为100的数据流与eth2放在了一个逻辑交换网络中。
下面开始分析linux vlan模块
一、相关的数据结构
1.1 struct vlan_ethhdr
包含vlan头部的二层头部结构体
/**
* struct vlan_ethhdr - vlan ethernet header (ethhdr + vlan_hdr)
* @h_dest: destination ethernet address
* @h_source: source ethernet address
* @h_vlan_proto: ethernet protocol
* @h_vlan_TCI: priority and VLAN ID
* @h_vlan_encapsulated_proto: packet type ID or len
*/
struct vlan_ethhdr {
unsigned char h_dest[ETH_ALEN];
unsigned char h_source[ETH_ALEN];
__be16 h_vlan_proto;
__be16 h_vlan_TCI;
__be16 h_vlan_encapsulated_proto;
};
1.2 struct vlan_hdr
vlan头部关联的结构体
/*
* struct vlan_hdr - vlan header
* @h_vlan_TCI: priority and VLAN ID
* @h_vlan_encapsulated_proto: packet type ID or len
*/
struct vlan_hdr {
__be16 h_vlan_TCI;
__be16 h_vlan_encapsulated_proto;
};
1.3 struct vlan_group
该结构体主要是实现将一个real device关联的vlan device设备存放在vlan_devices_arrays,(该变量首先是一个数组,数组里的每一个指针都是一个二级指针)
struct vlan_group {
unsigned int nr_vlan_devs;
struct hlist_node hlist; /* linked list */
struct net_device **vlan_devices_arrays[VLAN_PROTO_NUM]
[VLAN_GROUP_ARRAY_SPLIT_PARTS];
};
该变量也是比较重要的一个数据结构,可以保证一个real device针对每一个vlan id创建的vlan device都是唯一的。
Vlan group是通过全局变量vlan_group_hash的hash链表链接在一起的,其定义如下:
static struct hlist_head vlan_group_hash[VLAN_GRP_HASH_SIZE];
由于vlan device、real device都是使用的struct net_device变量,因此目前代码中,没有将vlan_group定义在net_device中,而是重新定义了一个全局的hash链表数组中。
那么问题来了,可不可以在struct net_device中增加一个链表头指针,将该real device关联的vlan device设备都添加到该链表中呢?
当然是可以的,这样做的话,在查找一个real device关联的所有vlan device时,其查找时间肯定比当前的查找时间要快的(因为不用进行hash相关的操作),但是每一个struct net_device变量都增加了一个链表指针,增加了对内存的使用,这个就需要在内存与性能之间做取舍了。
相应的set与get的接口函数如下:
/*
功能:根据vlan id以及vlan_group变量,查找符合条件的vlan device
*/
static inline struct net_device *__vlan_group_get_device(struct vlan_group *vg,
unsigned int pidx,
u16 vlan_id)
{
struct net_device **array;
array = vg->vlan_devices_arrays[pidx]
[vlan_id / VLAN_GROUP_ARRAY_PART_LEN];
return array ? array[vlan_id % VLAN_GROUP_ARRAY_PART_LEN] : NULL;
}
static inline struct net_device *vlan_group_get_device(struct vlan_group *vg,
__be16 vlan_proto,
u16 vlan_id)
{
return __vlan_group_get_device(vg, vlan_proto_idx(vlan_proto), vlan_id);
}
/*
功能:根据vlan id 以及vlan group变量,在vlan_devices_arrays中增加相应的vlan设备
由于是直接增加设备,没有进行存在性判断,因此在调用该函数之前,
应该调用vlan_group_get_device判断要添加的vlan设备是否已存在。
*/
static inline void vlan_group_set_device(struct vlan_group *vg,
__be16 vlan_proto, u16 vlan_id,
struct net_device *dev)
{
struct net_device **array;
if (!vg)
return;
array = vg->vlan_devices_arrays[vlan_proto_idx(vlan_proto)]
[vlan_id / VLAN_GROUP_ARRAY_PART_LEN];
array[vlan_id % VLAN_GROUP_ARRAY_PART_LEN] = dev;
}
1.4 struct vlan_dev_info
(linux3.14中没有找到该结构体定义)
vlan设备相关的信息结构体
该结构体定义了vlan device相关的一些信息,包括输入/输出方向优先级映射、vlan id、关联的real device、性能统计相关的信息等。
该结构相关的变量被定义在struct net_device的priv指针变量指向的内存空间中。
/**
* struct vlan_dev_info - VLAN private device data
* @nr_ingress_mappings: number of ingress priority mappings
* @ingress_priority_map: ingress priority mappings
* @nr_egress_mappings: number of egress priority mappings
* @egress_priority_map: hash of egress priority mappings
* @vlan_id: VLAN identifier
* @flags: device flags
* @real_dev: underlying netdevice
* @real_dev_addr: address of underlying netdevice
* @dent: proc dir entry
* @cnt_inc_headroom_on_tx: statistic - number of skb expansions on TX
* @cnt_encap_on_xmit: statistic - number of skb encapsulations on TX
*/
struct vlan_dev_info {
unsigned int nr_ingress_mappings;
u32 ingress_priority_map[8];
unsigned int nr_egress_mappings;
struct vlan_priority_tci_mapping *egress_priority_map[16];
u16 vlan_id;
u16 flags;
struct net_device *real_dev;
unsigned char real_dev_addr[ETH_ALEN];
struct proc_dir_entry *dent;
unsigned long cnt_inc_headroom_on_tx;
unsigned long cnt_encap_on_xmit;
};
以上就是主要相关的数据结构,下面分析下相应的功能。
二、linux vlan 架构分析
对于linux vlan架构来说,主要分为四个方面。
1. Struct net_device中相应接口函数的实现,既然vlan device使用的也是struct net_device变量,因此针对vlan device就需要实现相应的接口函数,包括vlan_dev_hard_header、vlan_dev_hard_start_xmit、vlan_dev_open、vlan_dev_stop
2. Linux vlan模块与socket模块的关联,因为linux vlan 模块与应用层是密切相关的,而对于协议栈来说,应用层与kernel层的通信一般使用socket的ioctl来实现,因此就需要在linux vlan 模块实现socket ioctl相关的接口函数,以实现vlan device的添加、删除、优先级的映射等,并在linux socket的ioctl中注册相应的接口函数
3. Linux vlan模块需要实现协议栈处理函数,包括数据包的接收处理以及发送处理等等
4. 作为一个成熟的功能模块,一定要提供debug手段,以实现对linux vlan device debug操作,目前一般是在proc文件系统中注册相应的目录与文件。
下面的内容就以这四个方面进行分析
2.1 struct net_device的接口函数分析
本节就分析一下主要的接口函数,而对于一些次要的接口函数,就不分析了。
2.1.1 vlan_dev_rebuild_header
功能:修改数据包的目的mac地址
主要用于发送方向第一次调用dev->hard_header时,由于目的ip地址没有解析导致目的mac地址没有设置的情况,通过调用该函数实现设置数据包的目的mac地址的功能。
/*
* Rebuild the Ethernet MAC header. This is called after an ARP
* (or in future other address resolution) has completed on this
* sk_buff. We now let ARP fill in the other fields.
*
* This routine CANNOT use cached dst->neigh!
* Really, it is used only when dst->neigh is wrong.
*
* TODO: This needs a checkup, I'm ignorant here. --BLG
*/
static int vlan_dev_rebuild_header(struct sk_buff *skb)
{
struct net_device *dev = skb->dev;
struct vlan_ethhdr *veth = (struct vlan_ethhdr *)(skb->data);
switch (veth->h_vlan_encapsulated_proto) {
#ifdef CONFIG_INET
case htons(ETH_P_IP):
/* TODO: Confirm this will work with VLAN headers... */
return arp_find(veth->h_dest, skb);
#endif
default:
pr_debug("%s: unable to resolve type %X addresses\n",
dev->name, ntohs(veth->h_vlan_encapsulated_proto));
ether_addr_copy(veth->h_source, dev->dev_addr);
break;
}
return 0;
}
2.1.2 vlan_dev_hard_header
功能:创建报文的二层头部信息。
对于本地发送或者转发的数据,其二层的源mac、目的mac地址均需要进行修改的。
1.对于支持reorder header的vlan device而言,此处仅需要增加增加二层头部即可,不需要增加vlan tag,vlan tag是在vlan_dev_hard_start_xmit中添加
2.对于不支持reorder header的vlan device而言,此处在增加二层头部的同时,还需要增加vlan tag,因为在vlan_dev_hard_start_xmit中,不会对该类vlan device增加vlan tag 了。
疑问:如果按照上述的分析,那在vlan_dev_hard_start_xmit应该需要判断vlan_dev->flag是否为1,若不为1,则说明数据包的vlan tag已经添加过了,就不需要再添加vlantag 了,但是在vlan_dev_hard_start_xmit中并没有对vlan_dev->flag的判断啊。这是什么原因呢?
我们注意到在vlan_dev_hard_start_xmit中有判断数据包是否已存在vlan tag,若存在则不再添加vlan tag,因为有了这个判断,所以就没有对vlan_dev->flag进行判断了。
其实vlan_dev_hard_start_xmit中仅仅判断是否带vlan tag来决定是否添加vlantag,本身是有问题的,即没法添加双层vlan tag了。
/*
* Create the VLAN header for an arbitrary protocol layer
*
* saddr=NULL means use device source address
* daddr=NULL means leave destination address (eg unresolved arp)
*
* This is called when the SKB is moving down the stack towards the
* physical devices.
*/
static int vlan_dev_hard_header(struct sk_buff *skb, struct net_device *dev,
unsigned short type,
const void *daddr, const void *saddr,
unsigned int len)
{
struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
struct vlan_hdr *vhdr;
unsigned int vhdrlen = 0;
u16 vlan_tci = 0;
int rc;
if (!(vlan->flags & VLAN_FLAG_REORDER_HDR)) {
vhdr = (struct vlan_hdr *) skb_push(skb, VLAN_HLEN);
vlan_tci = vlan->vlan_id;
vlan_tci |= vlan_dev_get_egress_qos_mask(dev, skb->priority);
vhdr->h_vlan_TCI = htons(vlan_tci);
/*
* Set the protocol type. For a packet of type ETH_P_802_3/2 we
* put the length in here instead.
*/
if (type != ETH_P_802_3 && type != ETH_P_802_2)
vhdr->h_vlan_encapsulated_proto = htons(type);
else
vhdr->h_vlan_encapsulated_proto = htons(len);
skb->protocol = vlan->vlan_proto;
type = ntohs(vlan->vlan_proto);
vhdrlen = VLAN_HLEN;
}
/* Before delegating work to the lower layer, enter our MAC-address */
if (saddr == NULL)
saddr = dev->dev_addr;
/* Now make the underlying real hard header */
dev = vlan->real_dev;
rc = dev_hard_header(skb, dev, type, daddr, saddr, len + vhdrlen);
if (rc > 0)
rc += vhdrlen;
return rc;
}
2.1.3 vlan_dev_hard_start_xmit
功能:vlan device的hard_start_xmit函数(real dev对应的网口不支持硬件vlan的hard_start_xmit函数)
1.当数据包没有携带vlan时,则根据数据包的priority、vlan device的vlanid,组建vlan值,并添加到数据包中
2.增加统计计数,修改数据包的dev指向real device
3.调用dev_queue_xmit,让数据从real device对应的网卡发送出去。
通过该函数可以看出,linux仅对untag的数据进行添加vlan 的操作,当数据包以及包含vlan时,则不会再进行添加vlan的操作。因此若想要支持双层vlan,还是需要修改linux的vlan架构的。
static netdev_tx_t vlan_dev_hard_start_xmit(struct sk_buff *skb,
struct net_device *dev)
{
struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
struct vlan_ethhdr *veth = (struct vlan_ethhdr *)(skb->data);
unsigned int len;
int ret;
/* Handle non-VLAN frames if they are sent to us, for example by DHCP.
*
* NOTE: THIS ASSUMES DIX ETHERNET, SPECIFICALLY NOT SUPPORTING
* OTHER THINGS LIKE FDDI/TokenRing/802.3 SNAPs...
*/
if (veth->h_vlan_proto != vlan->vlan_proto ||
vlan->flags & VLAN_FLAG_REORDER_HDR) {
u16 vlan_tci;
vlan_tci = vlan->vlan_id;
vlan_tci |= vlan_dev_get_egress_qos_mask(dev, skb->priority);
skb = __vlan_hwaccel_put_tag(skb, vlan->vlan_proto, vlan_tci);
}
skb->dev = vlan->real_dev;
len = skb->len;
if (unlikely(netpoll_tx_running(dev)))
return vlan_netpoll_send_skb(vlan, skb);
ret = dev_queue_xmit(skb);
if (likely(ret == NET_XMIT_SUCCESS || ret == NET_XMIT_CN)) {
struct vlan_pcpu_stats *stats;
stats = this_cpu_ptr(vlan->vlan_pcpu_stats);
u64_stats_update_begin(&stats->syncp);
stats->tx_packets++;
stats->tx_bytes += len;
u64_stats_update_end(&stats->syncp);
} else {
this_cpu_inc(vlan->vlan_pcpu_stats->tx_dropped);
}
return ret;
}
2.1.4 register_vlan_device
该函数实现注册一个vlan device设备,并实现对struct net_device的相应接口函数进行赋值,实现struct net_device的接口函数指向上述介绍的函数,通过该函数就真正的注册了一个vlan device设备,其二层发送函数就完全是vlan的处理了。这个函数还是比较重要的。
/* Attach a VLAN device to a mac address (ie Ethernet Card).
* Returns 0 if the device was created or a negative error code otherwise.
*/
static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
{
struct net_device *new_dev;
struct vlan_dev_priv *vlan;
struct net *net = dev_net(real_dev);
struct vlan_net *vn = net_generic(net, vlan_net_id);
char name[IFNAMSIZ];
int err;
if (vlan_id >= VLAN_VID_MASK)
return -ERANGE;
err = vlan_check_real_dev(real_dev, htons(ETH_P_8021Q), vlan_id);
if (err < 0)
return err;
/* Gotta set up the fields for the device. */
switch (vn->name_type) {
case VLAN_NAME_TYPE_RAW_PLUS_VID:
/* name will look like: eth1.0005 */
snprintf(name, IFNAMSIZ, "%s.%.4i", real_dev->name, vlan_id);
break;
case VLAN_NAME_TYPE_PLUS_VID_NO_PAD:
/* Put our vlan.VID in the name.
* Name will look like: vlan5
*/
snprintf(name, IFNAMSIZ, "vlan%i", vlan_id);
break;
case VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD:
/* Put our vlan.VID in the name.
* Name will look like: eth0.5
*/
snprintf(name, IFNAMSIZ, "%s.%i", real_dev->name, vlan_id);
break;
case VLAN_NAME_TYPE_PLUS_VID:
/* Put our vlan.VID in the name.
* Name will look like: vlan0005
*/
default:
snprintf(name, IFNAMSIZ, "vlan%.4i", vlan_id);
}
new_dev = alloc_netdev(sizeof(struct vlan_dev_priv), name, vlan_setup);
if (new_dev == NULL)
return -ENOBUFS;
dev_net_set(new_dev, net);
/* need 4 bytes for extra VLAN header info,
* hope the underlying device can handle it.
*/
new_dev->mtu = real_dev->mtu;
new_dev->priv_flags |= (real_dev->priv_flags & IFF_UNICAST_FLT);
vlan = vlan_dev_priv(new_dev);
vlan->vlan_proto = htons(ETH_P_8021Q);
vlan->vlan_id = vlan_id;
vlan->real_dev = real_dev;
vlan->dent = NULL;
vlan->flags = VLAN_FLAG_REORDER_HDR;
new_dev->rtnl_link_ops = &vlan_link_ops;
err = register_vlan_dev(new_dev);
if (err < 0)
goto out_free_newdev;
return 0;
out_free_newdev:
free_netdev(new_dev);
return err;
}
通过以上几个函数,就实现了一个vlan类型的struct net_device变量,创建了一个新的虚拟的网络接口。
2.2 Linux vlan模块与socket模块的关联
本小节分析下vlan模块在socket模块中注册的ioctl,以实现vlan接口的创建与删除等。
2.2.1 vlan_ioctl_set
该函数主要是为函数指针vlan_ioctl_hook赋值,而vlan_ioctl_hook则会socket的ioctl相关的命令下面,用于处理vlan相关的命令
void vlan_ioctl_set(int (*hook) (struct net *, void __user *))
{
mutex_lock(&vlan_ioctl_mutex);
vlan_ioctl_hook = hook;
mutex_unlock(&vlan_ioctl_mutex);
}
在sock_ioctl函数里,针对命令SIOCGIFVLAN、SIOCSIFVLAN,则会调用vlan_ioctl_hook进行相应的处理。
而在vlan_proto_init函数里,通过vlan_ioctl_set(vlan_ioctl_handler),将vlan_ioctl_hook指向了函数vlan_ioctl_handler,这样对于socket的ioctl的SIOCGIFVLAN、SIOCSIFVLAN命令,即会调用函数数vlan_ioctl_handler进行处理
2.2.2 vlan_ioctl_handler
功能:vlan模块相关的ioctl接口函数
/*
* VLAN IOCTL handler.
* o execute requested action or pass command to the device driver
* arg is really a struct vlan_ioctl_args __user *.
*/
static int vlan_ioctl_handler(struct net *net, void __user *arg)
{
int err;
struct vlan_ioctl_args args;
struct net_device *dev = NULL;
if (copy_from_user(&args, arg, sizeof(struct vlan_ioctl_args)))
return -EFAULT;
/* Null terminate this sucker, just in case. */
args.device1[23] = 0;
args.u.device2[23] = 0;
rtnl_lock();
switch (args.cmd) {
case SET_VLAN_INGRESS_PRIORITY_CMD:
case SET_VLAN_EGRESS_PRIORITY_CMD:
case SET_VLAN_FLAG_CMD:
case ADD_VLAN_CMD:
case DEL_VLAN_CMD:
case GET_VLAN_REALDEV_NAME_CMD:
case GET_VLAN_VID_CMD:
err = -ENODEV;
dev = __dev_get_by_name(net, args.device1);
if (!dev)
goto out;
err = -EINVAL;
if (args.cmd != ADD_VLAN_CMD && !is_vlan_dev(dev))
goto out;
}
switch (args.cmd) {
case SET_VLAN_INGRESS_PRIORITY_CMD:
err = -EPERM;
if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
break;
vlan_dev_set_ingress_priority(dev,
args.u.skb_priority,
args.vlan_qos);
err = 0;
break;
case SET_VLAN_EGRESS_PRIORITY_CMD:
err = -EPERM;
if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
break;
err = vlan_dev_set_egress_priority(dev,
args.u.skb_priority,
args.vlan_qos);
break;
case SET_VLAN_FLAG_CMD:
err = -EPERM;
if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
break;
err = vlan_dev_change_flags(dev,
args.vlan_qos ? args.u.flag : 0,
args.u.flag);
break;
case SET_VLAN_NAME_TYPE_CMD:
err = -EPERM;
if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
break;
if ((args.u.name_type >= 0) &&
(args.u.name_type < VLAN_NAME_TYPE_HIGHEST)) {
struct vlan_net *vn;
vn = net_generic(net, vlan_net_id);
vn->name_type = args.u.name_type;
err = 0;
} else {
err = -EINVAL;
}
break;
case ADD_VLAN_CMD:
err = -EPERM;
if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
break;
err = register_vlan_device(dev, args.u.VID);
break;
case DEL_VLAN_CMD:
err = -EPERM;
if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
break;
unregister_vlan_dev(dev, NULL);
err = 0;
break;
case GET_VLAN_REALDEV_NAME_CMD:
err = 0;
vlan_dev_get_realdev_name(dev, args.u.device2);
if (copy_to_user(arg, &args,
sizeof(struct vlan_ioctl_args)))
err = -EFAULT;
break;
case GET_VLAN_VID_CMD:
err = 0;
args.u.VID = vlan_dev_vlan_id(dev);
if (copy_to_user(arg, &args,
sizeof(struct vlan_ioctl_args)))
err = -EFAULT;
break;
default:
err = -EOPNOTSUPP;
break;
}
out:
rtnl_unlock();
return err;
}
这个函数里,有几个函数比较重要,我们需要好好分析下,这几个函数为vlan_dev_set_vlan_flag、register_vlan_device、register_vlan_device、vlan_dev_get_realdev_name、vlan_dev_get_vid、vlan_dev_set_ingress_priority、vlan_dev_set_egress_priority
2.2.2.1vlan_dev_set_vlan_flag
功能:设置vlan device的flag值,即是否支持reorder vlan header。
/* Flags are defined in the vlan_flags enum in include/linux/if_vlan.h file. */
int vlan_dev_change_flags(const struct net_device *dev, u32 flags, u32 mask)
{
struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
u32 old_flags = vlan->flags;
if (mask & ~(VLAN_FLAG_REORDER_HDR | VLAN_FLAG_GVRP |
VLAN_FLAG_LOOSE_BINDING | VLAN_FLAG_MVRP))
return -EINVAL;
vlan->flags = (old_flags & ~mask) | (flags & mask);
if (netif_running(dev) && (vlan->flags ^ old_flags) & VLAN_FLAG_GVRP) {
if (vlan->flags & VLAN_FLAG_GVRP)
vlan_gvrp_request_join(dev);
else
vlan_gvrp_request_leave(dev);
}
if (netif_running(dev) && (vlan->flags ^ old_flags) & VLAN_FLAG_MVRP) {
if (vlan->flags & VLAN_FLAG_MVRP)
vlan_mvrp_request_join(dev);
else
vlan_mvrp_request_leave(dev);
}
return 0;
}
2.2.2.2 vlan_dev_set_ingress_priority
功能:设置vlan优先级与数据包优先级的map
void vlan_dev_set_ingress_priority(const struct net_device *dev,
u32 skb_prio, u16 vlan_prio)
{
struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
if (vlan->ingress_priority_map[vlan_prio & 0x7] && !skb_prio)
vlan->nr_ingress_mappings--;
else if (!vlan->ingress_priority_map[vlan_prio & 0x7] && skb_prio)
vlan->nr_ingress_mappings++;
vlan->ingress_priority_map[vlan_prio & 0x7] = skb_prio;
}
2.2.2.3 vlan_dev_set_egress_priority
功能:设置输出数据包的qos与优先级之间的关联
int vlan_dev_set_egress_priority(const struct net_device *dev,
u32 skb_prio, u16 vlan_prio)
{
struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
struct vlan_priority_tci_mapping *mp = NULL;
struct vlan_priority_tci_mapping *np;
u32 vlan_qos = (vlan_prio << VLAN_PRIO_SHIFT) & VLAN_PRIO_MASK;
/* See if a priority mapping exists.. */
mp = vlan->egress_priority_map[skb_prio & 0xF];
while (mp) {
if (mp->priority == skb_prio) {
if (mp->vlan_qos && !vlan_qos)
vlan->nr_egress_mappings--;
else if (!mp->vlan_qos && vlan_qos)
vlan->nr_egress_mappings++;
mp->vlan_qos = vlan_qos;
return 0;
}
mp = mp->next;
}
/* Create a new mapping then. */
mp = vlan->egress_priority_map[skb_prio & 0xF];
np = kmalloc(sizeof(struct vlan_priority_tci_mapping), GFP_KERNEL);
if (!np)
return -ENOBUFS;
np->next = mp;
np->priority = skb_prio;
np->vlan_qos = vlan_qos;
/* Before inserting this element in hash table, make sure all its fields
* are committed to memory.
* coupled with smp_rmb() in vlan_dev_get_egress_qos_mask()
*/
smp_wmb();
vlan->egress_priority_map[skb_prio & 0xF] = np;
if (vlan_qos)
vlan->nr_egress_mappings++;
return 0;
}
2.2.2.4 vlan_dev_get_realdev_name
功能:根据vlan device的名称,获取其所依附的real device的名称。
void vlan_dev_get_realdev_name(const struct net_device *dev, char *result)
{
strncpy(result, vlan_dev_priv(dev)->real_dev->name, 23);
}
2.2.2.5 vlan_dev_get_vid
功能:查找vlan设备关联的vlan id
u16 vlan_dev_vlan_id(const struct net_device *dev)
{
return vlan_dev_priv(dev)->vlan_id;
}
至此将linux vlan模块与socket的关系分析完了。2.3 Linux vlan模块与协议栈处理函数的关联
既然新开发一个vlan的功能模块,自然需要融入当前的协议栈代码空间中。对于数据流来说也就tx、rx两个方向罢了,tx方向就是vlan device的hard_start_xmit函数,只要在注册struct net_device变量时,对其hard_start_xmit指针进行赋值即可,这个赋值操作在register_vlan_device中已经实现了,且vlan_dev_hard_start_xmit函数分析过了。那还需要考虑接收方向的vlan处理函数了。
那对于vlan的处理放在哪里比较合适呢
如果按协议栈来说的话,那因为vlan头部在二层mac头部后面,如果把其作为一个三层协议对待,像ipv4、ipv6那样,在三层协议相关的回调函数链表中,增加vlan接收处理函数的链表成员即可。
但是,这样做虽然可以正常的剥掉vlan头部,但是vlan协议并不是严格意义上的三层协议啊,如果按这样考虑的话,则把vlan接收处理函数放在netif_receive_skb中进行调用反而会好一点。
那该选用哪一个方式呢?其实上面两种方式也就是linux2.6.x与linux3.x的实现方式。
在linx 2.6的内核里,是通过将dev_add_pack将该接收函数注册到三层协议相关的接收函数的链表里的。即把vlan的接收函数与ip 、ipv6等协议的接收函数注册到同一个链表里的。
但是考虑到vlan毕竟是属于二层协议的范畴,因此在linux3.x中,对剥除vlan tag的操作进行了调整,即在netif_receive_skb中,即调用vlan_untag操作,剥除数据包的vlan tag,接着调用vlan_do_receive修改skb->dev的值,接着重新返回到vlan_untag的起始调用处,即实现了从real_dev->vlan_dev的转换。这样既将vlan的剥除与三层协议相关的接收函数区别开来,又省去了netif_rx的调用。
对于linux2.6.x与linux3.xvlan模块的实现差别,再多说一些;
对于linux3.x以上版本的vlan功能模块,在数据包从vlan dev发送出去时,仅仅设置skb->vlan_tci的值,而不在数据包中增加vlan头部信息。增加vlan头部的操作,是在将skb->dev设置为real dev后,在dev_hard_start_xmit中根据skb->vlan_tci的值来决定是否添加vlan 头部。而linux 3.x以下的版本,则是在vlan dev的dev->netdev_ops->ndo_start_xmit进行vlan tag的添加,然后才修改skb->dev的值为real dev。
在linux 3.x 以下,接收方向是在将skb->dev的值修改为vlan dev后,才剥除vlan tag的;发送方向,则是在添加vlan tag后,才修改skb->dev的值为real dev。
在linux 3.x以上,接收方向的数据包,只要有vlan tag,则会把vlan tag剥除掉,之后再将skb->dev修改成vlan dev;发送方向,则是在将skb->dev的值修改为real dev后,再添加 vlan tag。
一个是在vlan dev处进行vlan tag的剥除与添加,一个策略是在real dev处进行vlan tag的剥除与添加。
下面分析一下函数vlan_skb_recv。
2.3.1 vlan_do_receive
vlan设备的接收处理函数
功能:将数据包携带的vlan头部剥除掉
bool vlan_do_receive(struct sk_buff **skbp)
{
struct sk_buff *skb = *skbp;
__be16 vlan_proto = skb->vlan_proto;
u16 vlan_id = vlan_tx_tag_get_id(skb);
struct net_device *vlan_dev;
struct vlan_pcpu_stats *rx_stats;
vlan_dev = vlan_find_dev(skb->dev, vlan_proto, vlan_id);
if (!vlan_dev)
return false;
skb = *skbp = skb_share_check(skb, GFP_ATOMIC);
if (unlikely(!skb))
return false;
skb->dev = vlan_dev;
if (skb->pkt_type == PACKET_OTHERHOST) {
/* Our lower layer thinks this is not local, let's make sure.
* This allows the VLAN to have a different MAC than the
* underlying device, and still route correctly. */
if (ether_addr_equal(eth_hdr(skb)->h_dest, vlan_dev->dev_addr))
skb->pkt_type = PACKET_HOST;
}
if (!(vlan_dev_priv(vlan_dev)->flags & VLAN_FLAG_REORDER_HDR)) {
unsigned int offset = skb->data - skb_mac_header(skb);
/*
* vlan_insert_tag expect skb->data pointing to mac header.
* So change skb->data before calling it and change back to
* original position later
*/
skb_push(skb, offset);
skb = *skbp = vlan_insert_tag(skb, skb->vlan_proto,
skb->vlan_tci);
if (!skb)
return false;
skb_pull(skb, offset + VLAN_HLEN);
skb_reset_mac_len(skb);
}
skb->priority = vlan_get_ingress_priority(vlan_dev, skb->vlan_tci);
skb->vlan_tci = 0;
rx_stats = this_cpu_ptr(vlan_dev_priv(vlan_dev)->vlan_pcpu_stats);
u64_stats_update_begin(&rx_stats->syncp);
rx_stats->rx_packets++;
rx_stats->rx_bytes += skb->len;
if (skb->pkt_type == PACKET_MULTICAST)
rx_stats->rx_multicast++;
u64_stats_update_end(&rx_stats->syncp);
return true;
}
分析结束
上一篇: 利用GitHub搭建静态网站
下一篇: 如何使用github