This change also includes:
- Refactoring of gNMI protocol+driver to take advantage of the recent
changes to the gRPC protocol subsystem (e.g. no more locking, start RPC
with timeouts, etc.).
- Fixed Stratum driver to work after GeneralDeviceProvider refactoring
- Updated bmv2.py to generate ChassisConfig for stratum_bmv2
- Fixed portstate command to use the same port name as in the store
Change-Id: I0dad3bc73e4b6d907b5cf6b7b9a2852943226be7
This (big) change aims at solving the issue observed with mastership flapping
and device connection/disconnection with P4Runtime.
Channel handling is now based on the underlying gRPC channel state. Before,
channel events (open/close/error) were generated as a consequence of P4Runtime
StreamChannel events, making device availability dependent on mastership. Now
Stream Channel events only affect mastership (MASTER/STANDBY or NONE when the
SteamChannel RPC is not active).
Mastership handling has been refactored to generate P4Runtime election IDs that
are compatible with the mastership preference decided by the MastershipService.
GeneralDeviceProvider has been re-implemented to support in-order
device event processing and to reduce implementation complexity. Stats polling
has been moved to a separate component, and netcfg handling updated to only
depend on BasicDeviceConfig, augmented with a pipeconf field, and re-using the
managementAddress field to set the gRPC server endpoints (e.g.
grpc://myswitch.local:50051). Before it was depending on 3 different config
classes, making hard to detect changes.
Finally, this change affects some core interfaces:
- Adds a method to DeviceProvider and DeviceHandshaker to check for device
availability, making the meaning of availability device-specific. This is needed
in cases where the device manager needs to change the availability state of a
device (as in change #20842)
- Support device providers not capable of reconciling mastership role responses
with requests (like P4Runtime).
- Clarify the meaning of "connection" in the DeviceConnect behavior.
- Allows driver-based providers to check devices for reachability and
availability without probing the device via the network.
Change-Id: I7ff30d29f5d02ad938e3171536e54ae2916629a2
- Write calls to file, one file for each channel
- Log started, inbound msg, outbound msg, and closed events, for each RPC
- Distinguish between different RPCs by assigning an ID to each one
Also, removed redundant DeviceId attribute from GrpcChannelId, as all
channel IDs were already created using a client key that contains the
DeviceID. It seems a better approach to not restrict the definition of a
channel ID and have that defined simply as a string.
Change-Id: I9d88e528218a5689d6835c9b48022119976b6c5a
The new client API supports batching and provides detailed response for
write requests (e.g. if entity already exists when inserting), which was
not possible with the old one.
This patch includes:
- New more efficient implementation of P4RuntimeClient (no more locking,
use native gRPC executor, use stub deadlines)
- Ported all codecs to new AbstractCodec-based implementation (needed to
implement codec cache in the future)
- Uses batching in P4RuntimeFlowRuleProgrammable and
P4RuntimeGroupActionProgrammable
- Minor changes to PI framework runtime classes
Change-Id: I3fac42057bb4e1389d761006a32600c786598683
- Bumped version of protobuf to 3.6.1.3 (includes fix for Bazel 0.22)
- Removed all protobuf and grpc dependencies from deps.json. Instead,
depends solely on what's provided by the external grpc and protobuf
workspaces.
- Use OSGi-wrapped protobuf and grpc JARs built with Bazel for runtime
- Add missing netty-related bundles to onos-thirdparty-base (required
by grpc)
Note, build with Bazel 0.22 is still broken because of
osgi_java_library.bzl, unless the following build arg is used:
build --incompatible_string_is_not_iterable=false
It seems the error is caused by dead code in osgi_java_library.bzl
that should be removed.
Change-Id: I749f1de25902bf9df5242444380f7224bc99b4b5
Includes also other minor changes to gRPC channel creation/connection
process, such as:
- More compact logs showing the gRPC client key
- GrpcChannelController.connectChannel() now returns the same
StatusRuntime exception, no need to wrap it in an IOException
- Wait for channel shutdown after initial connection error
Change-Id: Ib7d2b728b8c82d9f9b2097cffcebd31cac891b27
created netty specific feature separate from third-party-base
refactored boot features to ensure proper boot sequence for netty, sshd, and core-net
moved http codec to netty feature
Change-Id: Ie6e0ce14fba71603086b7cfe62e1c90a77fd18f2
Co-authored-by: Ray Milkey <ray@opennetworking.org>
A convenient macro for packaging together all proto and gRPC libraries
in an OSGi jar is provided. Also re-packaging of gRPC core (to avoid OSGi
split problem) is simplified by depending on a patched fork of grpc-java.
Change-Id: Idb79a5bea8ae0bc57b146bda1fc47a4568d12c60
Most notably, we fix a bug in which some nodes were not able to find
pipeconf-specific behaviors for a given device. The problem is not
completelly solved but it's mitigated.
There's a race condition caused by the fact that the GDP updates the cfg
with the merged driver name before advertising the device to the core.
Some nodes might receive the cfg update after the device has been
advertised. We mitigate the problem by performing the pipeline deploy
(slow operation) after the cfg update, giving more time for nodes
to catch up. Perhaps we should listen for cfg update events before
advertising the device to the core?
Also:
- NPE when getting P4Runtime client
- Detect if a base driver is already merged in pipeconf manager
- Longer timeouts in P4Runtime driver and protocol (for slow networks)
- Configurable timeout in P4Runtime driver and GDP
- NPE when adding/removing device agent listeners in P4Rtunime handshaker
- Various exceptions due to race conditions in GDP when disconnecting
devices (by serializing disconnect tasks per device)
- NPE when cancelling polling tasks in GDP
- Refactored PipeconfService to distinguish between driver merge,
pipeconf map update, and cfg update (now performed in the GDP)
- Fixed PipeconfManagerTest, not testing driver behaviours
- Use Guava striped locks when possible (more memory-efficient than maps,
and with strict atomicity guarantees w.r.t. to caches).
Change-Id: I30f3887541ba0fd44439a86885e9821ac565b64c