Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add nodes pre-provisioning #127

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion src/main/java/org/jenkinsci/plugins/vSphereCloud.java
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,8 @@
import org.jenkinsci.plugins.folder.FolderVSphereCloudProperty;
import org.jenkinsci.plugins.vsphere.VSphereConnectionConfig;
import org.jenkinsci.plugins.vsphere.tools.*;
import org.kohsuke.accmod.Restricted;
import org.kohsuke.accmod.restrictions.NoExternalUse;
import org.kohsuke.stapler.DataBoundConstructor;
import org.kohsuke.stapler.QueryParameter;
import org.kohsuke.stapler.Stapler;
Expand Down Expand Up @@ -215,7 +217,8 @@ public List<? extends vSphereCloudSlaveTemplate> getTemplates() {
return this.templates;
}

private vSphereCloudSlaveTemplate getTemplateForVM(final String vmName) {
@Restricted(NoExternalUse.class)
public vSphereCloudSlaveTemplate getTemplateForVM(final String vmName) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be public?
Could we get away with just default (package-level) visibility instead, i.e. vSphereCloudSlaveTemplate getTemplateForVM(final String vmName) { rather than public vSphereCloudSlaveTemplate getTemplateForVM(final String vmName) { ?

(we'll need the @Restricted as it's not private but package-level, or protected, would be preferred over public)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think that) This method is only called from elsewhere within this class and the neighbouring vSphereCloudSlaveTemplate class; they're both in the same package.
I believe, therefore, that it doesn't need to be public and that it can be set to default visibility.
i.e.

    @Restricted(NoExternalUse.class)
    vSphereCloudSlaveTemplate getTemplateForVM(final String vmName) {

if (this.templates == null || vmName == null)
return null;
for (final vSphereCloudSlaveTemplate t : this.templates) {
Expand Down
6 changes: 6 additions & 0 deletions src/main/java/org/jenkinsci/plugins/vSphereCloudSlave.java
Original file line number Diff line number Diff line change
Expand Up @@ -133,6 +133,12 @@ public Integer getLimitedTestRunCount() {
return LimitedTestRunCount;
}

@Restricted(NoExternalUse.class)
public vSphereCloudSlaveTemplate getTemplate() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be public?
Could we get away with just default (package-level) visibility instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that this method is used. I'm looking using my browser on github, and it says that "getTemplate(" only occurs here, where it's declared, and nowhere else in this PR.
If that's the case, it can/should be removed.

vSphereCloud cloud = findOurVsInstance();
return cloud.getTemplateForVM(getVmName());
}

public boolean isLaunchSupportForced() {
return ((vSphereCloudLauncher) getLauncher()).getOverrideLaunchSupported() == Boolean.TRUE;
}
Expand Down
17 changes: 17 additions & 0 deletions src/main/java/org/jenkinsci/plugins/vSphereCloudSlaveComputer.java
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,10 @@

import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;

import javax.annotation.Nonnull;
import org.jenkinsci.plugins.vsphere.tools.VSphere;

import com.vmware.vim25.VirtualHardware;
Expand All @@ -14,8 +17,10 @@
import com.vmware.vim25.mo.ManagedEntity;
import com.vmware.vim25.mo.VirtualMachine;

import hudson.model.Computer;
import hudson.slaves.AbstractCloudComputer;
import hudson.slaves.AbstractCloudSlave;
import jenkins.model.Jenkins;

public class vSphereCloudSlaveComputer extends AbstractCloudComputer {
private final vSphereCloudSlave vSlave;
Expand Down Expand Up @@ -77,6 +82,18 @@ public String getVmInformationError() {
return getVMInformation().errorEncounteredWhenDataWasRead;
}

/**
* Get all computers.
korablin marked this conversation as resolved.
Show resolved Hide resolved
*/
protected static @Nonnull List<vSphereCloudSlaveComputer> getAll() {
korablin marked this conversation as resolved.
Show resolved Hide resolved
ArrayList<vSphereCloudSlaveComputer> out = new ArrayList<>();
for (final Computer c : Jenkins.get().getComputers()) {
if (!(c instanceof vSphereCloudSlaveComputer)) continue;
out.add((vSphereCloudSlaveComputer) c);
}
return out;
}

/** 10 seconds */
private static final long NANOSECONDS_TO_CACHE_VMINFORMATION = 10L * 1000000000L;

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,6 @@
import java.util.Set;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.stream.Collectors;

import javax.annotation.Nonnull;

Expand Down Expand Up @@ -114,6 +113,7 @@ public class vSphereCloudSlaveTemplate implements Describable<vSphereCloudSlaveT
private final boolean saveFailure;
private final String targetResourcePool;
private final String targetHost;
private final int instancesMin;
/**
* Credentials from old configuration format. Credentials are now in the
* {@link #launcher} configuration
Expand Down Expand Up @@ -152,6 +152,7 @@ public vSphereCloudSlaveTemplate(final String cloneNamePrefix,
final boolean saveFailure,
final String targetResourcePool,
final String targetHost,
final int instancesMin,
final String credentialsId /*deprecated*/,
final ComputerLauncher launcher,
final RetentionStrategy<?> retentionStrategy,
Expand Down Expand Up @@ -181,6 +182,7 @@ public vSphereCloudSlaveTemplate(final String cloneNamePrefix,
this.saveFailure = saveFailure;
this.targetResourcePool = targetResourcePool;
this.targetHost = targetHost;
this.instancesMin = instancesMin;
this.credentialsId = credentialsId;
this.nodeProperties = Util.fixNull(nodeProperties);
this.guestInfoProperties = Util.fixNull(guestInfoProperties);
Expand Down Expand Up @@ -272,6 +274,10 @@ public int getLimitedRunCount() {
return this.limitedRunCount;
}

public int getInstancesMin() {
return this.instancesMin;
}

public boolean getSaveFailure() {
return this.saveFailure;
}
Expand Down Expand Up @@ -319,6 +325,52 @@ public RetentionStrategy<?> getRetentionStrategy() {
return this.retentionStrategy;
}

/**
* Return a list of running nodes provisioned using this template.
*/
@Restricted(NoExternalUse.class)
public List<vSphereCloudSlaveComputer> getOnlineNodes() {
pjdarton marked this conversation as resolved.
Show resolved Hide resolved
return getNodes(false, false);
}

/**
* Return a list of idle nodes provisioned using this template.
*/
@Restricted(NoExternalUse.class)
public List<vSphereCloudSlaveComputer> getIdleNodes() {
pjdarton marked this conversation as resolved.
Show resolved Hide resolved
return getNodes(true, false);
}

/**
* Return a list of busy nodes provisioned using this template
* that can be reused.
*/
@Restricted(NoExternalUse.class)
public List<vSphereCloudSlaveComputer> getBusyReusableNodes() {
korablin marked this conversation as resolved.
Show resolved Hide resolved
return getNodes(false, true);
}

private List<vSphereCloudSlaveComputer> getNodes(Boolean idle, Boolean reusable) {
korablin marked this conversation as resolved.
Show resolved Hide resolved
List<vSphereCloudSlaveComputer> nodes = new ArrayList<>();
for (vSphereCloudSlaveComputer node : vSphereCloudSlaveComputer.getAll()) {
if (!node.isOnline()) continue;
if (idle && !node.isIdle()) continue;
// only busy nodes are counted as reusable
if (reusable && node.isIdle()) continue;
String vmName = node.getName();
vSphereCloudSlaveTemplate nodeTemplate = getParent().getTemplateForVM(vmName);
// Filter out nodes from other clouds: nodeTemplate is null for these.
if (nodeTemplate == null) continue;
if (getLabelString() != nodeTemplate.getLabelString()) continue;
if (reusable) {
RetentionStrategy<?> nodeStrategy = nodeTemplate.getRetentionStrategy();
if (nodeStrategy instanceof RunOnceCloudRetentionStrategy) continue;
}
nodes.add(node);
}
return nodes;
}

protected Object readResolve() {
this.labelSet = Label.parse(labelString);
if(this.templateInstanceCap == 0) {
Expand Down Expand Up @@ -550,6 +602,10 @@ public FormValidation doCheckLimitedRunCount(@QueryParameter String limitedRunCo
return FormValidation.validateNonNegativeInteger(limitedRunCount);
}

public FormValidation doCheckInstancesMin(@QueryParameter String instancesMin) {
return FormValidation.validateNonNegativeInteger(instancesMin);
}

public FormValidation doCheckTemplateInstanceCap(@QueryParameter String templateInstanceCap) {
return FormValidation.validateNonNegativeInteger(templateInstanceCap);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,10 @@ public long check(final AbstractCloudComputer c) {
if (c.isIdle() && !disabled) {
final long idleMilliseconds = System.currentTimeMillis() - c.getIdleStartMilliseconds();
if (idleMilliseconds > TimeUnit.MINUTES.toMillis(idleMinutes)) {
if (VSphereNodeReconcileWork.shouldNodeBeRetained(c)) {
LOGGER.log(Level.FINE, "Keeping {0} to meet minimum requirements", c.getName());
korablin marked this conversation as resolved.
Show resolved Hide resolved
return 1;
korablin marked this conversation as resolved.
Show resolved Hide resolved
}
LOGGER.log(
Level.FINE,
"Disconnecting {0} because it has been idle for more than {1} minutes (has been idle for {2}ms)",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,15 +5,26 @@
import hudson.Extension;
import hudson.model.Descriptor;
import hudson.model.DescriptorVisibilityFilter;
import hudson.slaves.AbstractCloudComputer;
import hudson.slaves.AbstractCloudSlave;
import hudson.slaves.CloudRetentionStrategy;
import hudson.slaves.RetentionStrategy;

import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

import static java.util.concurrent.TimeUnit.*;
import javax.annotation.concurrent.GuardedBy;

import org.kohsuke.accmod.Restricted;
import org.kohsuke.accmod.restrictions.NoExternalUse;
import org.kohsuke.stapler.DataBoundConstructor;

public class VSphereCloudRetentionStrategy extends CloudRetentionStrategy {

private static final Logger LOGGER = Logger.getLogger(VSphereCloudRetentionStrategy.class.getName());

private final int idleMinutes;

@DataBoundConstructor
Expand All @@ -26,6 +37,16 @@ public int getIdleMinutes() {
return idleMinutes;
}

@Override
@GuardedBy("hudson.model.Queue.lock")
public long check(final AbstractCloudComputer c) {
if (VSphereNodeReconcileWork.shouldNodeBeRetained(c)) {
LOGGER.log(Level.FINE, "Keeping {0} to meet minimum requirements", c.getName());
korablin marked this conversation as resolved.
Show resolved Hide resolved
return 1;
}
return super.check(c);
}

@Override
public DescriptorImpl getDescriptor() {
return DESCRIPTOR;
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,119 @@
package org.jenkinsci.plugins.vsphere;

import hudson.Extension;
import hudson.Functions;
import hudson.model.TaskListener;
import hudson.model.AsyncPeriodicWork;
import hudson.slaves.AbstractCloudComputer;
import hudson.slaves.AbstractCloudSlave;
import hudson.slaves.Cloud;
import hudson.model.Label;
import jenkins.model.Jenkins;
import org.jenkinsci.plugins.vSphereCloud;
import org.jenkinsci.plugins.vSphereCloudSlave;
import org.jenkinsci.plugins.vSphereCloudSlaveComputer;
import org.jenkinsci.plugins.vSphereCloudSlaveTemplate;

import java.io.IOException;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

import org.kohsuke.accmod.Restricted;
import org.kohsuke.accmod.restrictions.NoExternalUse;

/**
* A {@link AsyncPeriodicWork} that reconciles nodes to meet template values.
* <p>
* The async work will check the number of deployed nodes and provision (or
* delete) additional ones to meet template values. The check is happening every
* 2 minutes.
*/
@Extension
@Restricted(NoExternalUse.class)
public final class VSphereNodeReconcileWork extends AsyncPeriodicWork {
private static final Logger LOGGER = Logger.getLogger(VSphereNodeReconcileWork.class.getName());

public VSphereNodeReconcileWork() {
super("Vsphere nodes reconciliation");
}

@Override
public long getRecurrencePeriod() {
return Functions.getIsUnitTest() ? Long.MAX_VALUE : MIN * 2;
}

@Override
public void execute(TaskListener listener) {
for (Cloud cloud : Jenkins.getActiveInstance().clouds) {
if (!(cloud instanceof vSphereCloud)) continue;
for (vSphereCloudSlaveTemplate template : ((vSphereCloud) cloud).getTemplates()) {
String templateLabel = template.getLabelString();
Label label = Label.get(templateLabel);

int instancesMin = template.getInstancesMin();
List<vSphereCloudSlaveComputer> idleNodes = template.getIdleNodes();
List<vSphereCloudSlaveComputer> runningNodes = template.getOnlineNodes();
List<vSphereCloudSlaveComputer> reusableBusyNodes = template.getBusyReusableNodes();
// Get max number of nodes that could be provisioned
int globalMaxNodes = ((vSphereCloud) cloud).getInstanceCap();
int templateMaxNodes = template.getTemplateInstanceCap();
int maxNodes = Math.min(globalMaxNodes, templateMaxNodes);

// if maxNumber is lower than instancesMin, we have to ignore instancesMin
int toProvision = Math.min(instancesMin - (reusableBusyNodes.size() + idleNodes.size()),
maxNodes - runningNodes.size());
if (toProvision > 0) {
// provision desired number of nodes for this label
LOGGER.log(Level.INFO, "Pre-creating {0} instance(s) for template {1} in cloud {3}",
new Object[] { toProvision, templateLabel, cloud.name });
try {
cloud.provision(label, toProvision);
korablin marked this conversation as resolved.
Show resolved Hide resolved
} catch (Throwable ex) {
LOGGER.log(Level.SEVERE, "Failed to pre-create instance from template {0}. Exception: {1}",
new Object[] { templateLabel, ex });
}
} else if (toProvision < 0) {
korablin marked this conversation as resolved.
Show resolved Hide resolved
int toDelete = Math.min(idleNodes.size(), Math.abs(toProvision));
for (int i = 0; i < toDelete; i++) {
AbstractCloudSlave node = idleNodes.get(i).getNode();
korablin marked this conversation as resolved.
Show resolved Hide resolved
if (node == null) continue;
LOGGER.log(Level.INFO, "Found excessive instance. Terminating {0} node {1}.",
new Object[] { idleNodes.get(i).getName(), node });
try {
node.terminate();
} catch (InterruptedException | IOException e) {
LOGGER.log(Level.WARNING, e.getMessage());
// try to delete it later
continue;
}
}
}
}
}
}

/**
* Should a node be retained to meet the minimum instances constraint?
*/
@SuppressWarnings("rawtypes")
korablin marked this conversation as resolved.
Show resolved Hide resolved
public static boolean shouldNodeBeRetained(AbstractCloudComputer c) {
// Checks only idle nodes
vSphereCloudSlave node = (vSphereCloudSlave) c.getNode();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An AbstractCloudComputer could be anything, so its node might not be a vSphereCloudSlave and hence this could blow up with a class-cast exception.
You could return false if it isn't "one of ours... or you could change the declared type of argumentc` to be a vSphere-specific object so it'll always be "one of ours".

if (node == null) return false;
vSphereCloudSlaveTemplate nodeTemplate = node.getTemplate();
// nodeTemplate might be null if the template was manually deleted from a cloud
if (nodeTemplate == null) return false;
int instancesMin = nodeTemplate.getInstancesMin();
if (instancesMin > 0) {
int maxNodes = Math.min(nodeTemplate.getTemplateInstanceCap(), nodeTemplate.getParent().getInstanceCap());
int runningNodesTotal = nodeTemplate.getOnlineNodes().size();
int idleNodesTotal = nodeTemplate.getIdleNodes().size();
int reusableBusyNodesTotal = nodeTemplate.getBusyReusableNodes().size();
if ((instancesMin >= (idleNodesTotal + reusableBusyNodesTotal - 1)) && (runningNodesTotal <= maxNodes)) {
return true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm reading the code incorrectly here, but it looks to me like this code doesn't distinguish between the order in which nodes were created and will give the same answer about all of them.
e.g. lets say we've got instancesMin=3, instanceCap=8 and we've got 5 instances that exist. If this code is called on all 5 in quick succession, it would answer "this is not needed" to all (causing all 5 to be disposed of) or "this is needed" (causing all 5 to be retained).
What we actually need is for this code to figure out which 3 of the 5 instances should be retained.

}
}
return false;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,10 @@
<f:textbox clazz="required number" default="0"/>
</f:entry>

<f:entry title="${%Instances Min}" field="instancesMin">
<f:textbox clazz="required number" default="0"/>
</f:entry>

<f:entry title="${%# of Executors}" field="numberOfExecutors">
<f:textbox clazz="required positive-number" default="1"/>
</f:entry>
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
<div>
The number of VMs to be provisioned beforehand.<br>
This allows to speed up CI runs by starting them immediately without waiting for a VM to get booted.<br>
<dt>If the number set to 0:</dt>
<dd>No VMs provisioned in advance.</dd>
<dt>If the number is bigger than 0:</dt>
<dd>The plugin provisions new VMs to meet the value.</dd>
<dt>If Keep-Until-Idle retention strategy is used along with this option:</dt>
<dd>Busy VMs are counted as reuseable, so the plugin will not provision more VMs even if all of them are currently busy.</dd>
<dt>If Run-Once retention strategy is used along with this option:</dt>
<dd>Busy VMs are counted as non reusable, so the plugin will provision more VMs to meet the value.<br>
When at maximum capacity new VM will be added after an old VM is gone.</dd>
<dt>If instances Min is bigger than instance Cap:</dt>
<dd>The plugin will provision max number of VMs specified in instance Cap (the smallest of cloud and template options).</dd>
The plugin checks the number of running VMs once in 2 minutes.
</div>
Original file line number Diff line number Diff line change
Expand Up @@ -316,7 +316,7 @@ private CloudProvisioningRecord createInstance(int capacity, int provisioned, in

private static vSphereCloudSlaveTemplate stubTemplate(String prefix, int templateInstanceCap) {
return new vSphereCloudSlaveTemplate(prefix, "", null, null, false, null, null, null, null, null, null, templateInstanceCap, 1,
null, null, null, false, false, 0, 0, false, null, null, null, new JNLPLauncher(),
null, null, null, false, false, 0, 0, false, null, null, 0, null, new JNLPLauncher(),
RetentionStrategy.NOOP, null, null);
}

Expand Down
Loading