Explore alternative options for mkcloud deployment #1129

Open
2 of 7 tasks
Itxaka opened this issue Aug 8, 2016 · 23 comments

Comments

@Itxaka
Member

Itxaka commented Aug 8, 2016

Unfortunately, setting up a test environment with mkcloud takes quite some time, and mkcloud only allows one snapshot at a time, which makes it difficult to test and iterate quickly locally.

We should try to identify alternative routes to faster mkcloud deployments for development purposes.

Some bullet points from my (still ignorant) point of view of mkcloud:

  • Minimising re-downloading of images (mkcloud should cache downloads in /var/cache #223); a minimal sketch of this idea follows the list
  • Faster download of images (Need faster image download mechanism #1139)
  • Using docker containers?
    • Comment: This could help us test crowbar at a scale much easier locally.
  • Using pre-packaged images for the nodes (like the admin?)
    • Question: Is it necessary to pxe reinstall the nodes on each deployment?
  • Using pre-packaged images for nodes and admin with the software already installed, only running the updater on deploy for faster deployment (removing the need to download extra packages)
  • Allowing a proxy to be set on the nodes so any downloads can be served from the proxy's cache

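For the caching point in the first bullet, a minimal sketch of the idea, assuming a hypothetical helper and placeholder names/URLs (the real #223 implementation may look different):

```bash
#!/bin/bash
# Hypothetical cache-aware fetch for mkcloud: only download an image if
# it is not already present in /var/cache/mkcloud (all names are illustrative).
CACHE_DIR=/var/cache/mkcloud
mkdir -p "$CACHE_DIR"

cached_fetch() {
    local url=$1
    local file="$CACHE_DIR/$(basename "$url")"
    if [ ! -f "$file" ]; then
        # -c resumes a partial download left over from an interrupted run
        wget -c -O "$file" "$url"
    fi
    echo "$file"
}

# Example usage (placeholder URL, not a real clouddata path):
admin_image=$(cached_fetch "http://clouddata.example.com/images/cleanvm.qcow2")
echo "Using image: $admin_image"
```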
Looking for some comments here to see if it's feasible or not and what else could be looked at.

Speeding up the product itself (not mkcloud)

  • Only run a subset of roles in the chef-client run instead of all roles (part of the crowbar orchestration epic) - this will be the biggest speed-up by far, because currently applying each proposal repeats the same recipes over and over again (a rough illustration follows)
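This isn't the orchestration work itself, just an illustration of the general idea: chef-client can converge a subset via its override run-list option instead of the node's full run list (role/recipe names below are placeholders):

```bash
# Converge only the roles/recipes touched by the proposal being applied,
# instead of re-running every recipe (names are placeholders).
chef-client --override-runlist 'role[nova-controller],recipe[nova::api]'
```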
@nicolasbock
Contributor

Very good idea! Let me know if I can help in any way.

@toabctl
Contributor

toabctl commented Aug 9, 2016

Using pre-packaged images for the nodes (like the admin?)
Question: Is it necessary to pxe reinstall the nodes on each deployment?

It's necessary to test the installation. But we already have a thing called "crowbar_register". So we could install the nodes with crowbar_register and just PXE-boot a single node which is not used. I think that would speed up the installation.

@Itxaka
Member Author

Itxaka commented Aug 9, 2016

It's necessary to test the installation. But we already have a thing called "crowbar_register". So we could install the nodes with crowbar_register and just PXE-boot a single node which is not used. I think that would speed up the installation.

Sure, that will always be the case on Jenkins; my line of thinking was more directed at a local setup which is easier to create/tear down.

Another option would be Vagrant + vagrant-libvirt and its ability to boot from different boxes; we could keep pre-made boxes remotely or locally for reuse, plus it supports PXE boot and different deploy methods.

https://github.com/vagrant-libvirt/vagrant-libvirt
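For reference, the basic vagrant-libvirt workflow would look roughly like this (the box name is only an example, not something we publish today):

```bash
# Bring nodes up from a pre-made box via libvirt instead of PXE-installing them.
vagrant plugin install vagrant-libvirt
vagrant box add opensuse/leap-42.1 --provider libvirt   # example box name
vagrant up --provider libvirt
```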

@nicolasbock
Contributor

Sure, that will always be the case on Jenkins; my line of thinking was more directed at a local setup which is easier to create/tear down.

That's exactly the use case I am interested in as well. A quick way to deploy a test cloud locally. Right now a full setup with mkcloud takes quite a bit of time.

@aspiers
Contributor

aspiers commented Aug 9, 2016

Big +1 for submitting this issue and any subsequent efforts to address it! I really dislike that we have an internal SUSE-only deployment tool which reinvents many wheels badly :-( But it's there for historical reasons.

Please see https://github.com/SUSE-Cloud/suse-cloud-vagrant/ which has been around for a long time but needs new boxes built for SOC6 and 7. (If you are interested I can explain how to get this working automatically in IBS - don't even think about doing it manually as all the heavy lifting is already done.)

@Itxaka
Member Author

Itxaka commented Aug 11, 2016

Also, having local repos has not worked for me since the start :(
It would be good to have some docs on that.

@aspiers
Contributor

aspiers commented Aug 11, 2016

I've submitted #1139. Let's submit a github issue for each area of potential improvement, and link to it from this "master" issue.

@aspiers
Contributor

aspiers commented Aug 11, 2016

@Itxaka wrote:

Also, having local repos has not worked for me since the start :(
It would be good to have some docs on that.

I guess you are using sync-repos? Please could you file a new issue describing what doesn't work? Probably best to file it in the repo which contains sync-repos.

@Itxaka
Member Author

Itxaka commented Aug 11, 2016

I guess you are using sync-repos? Please could you file a new issue describing what doesn't work? Probably best to file it in the repo which contains sync-repos.

Good point, will do. Although I feel that is a local issue more than a sync-repos issue, I will create an issue requesting more docs/examples.

@aspiers
Contributor

aspiers commented Aug 11, 2016

I extended your description with another area for significant improvement.

@dguitarbite
Contributor

Following up from today's cloud meeting, I have sanitized my scripts and created a PR; it should address some problems for the time being. Could we also discuss this on the PR? #1145

@nicolasbock
Contributor

nicolasbock commented Sep 8, 2016

@Itxaka and other remote workers: How long does it take you guys right now to deploy a cloud? I haven't carefully timed it but it seems that mkcloud{1,2} take about an hour for mkcloud plain. And what kind of hardware are you using?

@Itxaka
Member Author

Itxaka commented Sep 9, 2016

@nicolasbock For me it takes about 40 minutes to deploy a cloud6 env, and that is with the ISOs and admin qcow image already downloaded locally and mkcloud hijacked to use them.
For a susecloud7 it can take up to 1 hour 20 minutes due to having to re-download the ISO image about 2 different times (not as easy to hijack as cloud6) ¯_(ツ)_/¯

Hardware: i7 3.8 GHz, 20 GB RAM, 512 GB SSD. I would say that in my case, the bandwidth is a huge bottleneck. I spend around 40 minutes on downloads alone :(

@nicolasbock
Contributor

@Itxaka That's pretty bad 😦. I was hoping you'd tell me it's like 10 minutes or something like that 😉. And the hardware you are using is pretty fast too...

@Itxaka
Member Author

Itxaka commented Sep 9, 2016

@nicolasbock yeah 😭

Probably one of the huge timesinks is the PXE install of both nodes. I would love to be able to modify that so you only get the PXE install with a flag or something like that, and otherwise both nodes come up from a pre-made qcow file with the system already installed. No point in reinstalling the nodes every time :O
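A rough sketch of the pre-made-qcow idea with qemu-img, assuming an already-installed base image exists (paths and names are illustrative, this is not something mkcloud does today):

```bash
# Create thin copy-on-write overlays of a pre-installed base image for each
# node instead of PXE-installing them (paths are illustrative).
BASE=/var/lib/libvirt/images/sles12-node-base.qcow2

for node in node1 node2; do
    qemu-img create -f qcow2 -F qcow2 -b "$BASE" \
        "/var/lib/libvirt/images/$node.qcow2"
done
```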

@aspiers
Contributor

aspiers commented Sep 9, 2016

Isn't this already possible with the setuplonelynodes and crowbar_register steps? (BTW "lonely node" is not a good name for this feature - we should really change it.)

@aspiers
Contributor

aspiers commented Sep 9, 2016

BTW 40 mins sounds about right for me too.

@Itxaka
Member Author

Itxaka commented Sep 9, 2016

Isn't this already possible with the setuplonelynodes and crowbar_register steps? (BTW "lonely node" is not a good name for this feature - we should really change it.)

No idea, I did a couple of tests with that option but could not find out what they were there for or what to do after :D

@nicolasbock
Contributor

It'd be interesting to find out where the deployment process spends all its time. I am thinking of something along the lines of what bootchart does.
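As a low-tech sketch of that, one could wrap each mkcloud step in a timing helper and compare the logged durations afterwards (the step names below are only indicative of how mkcloud is invoked and may not match exactly):

```bash
# Minimal per-step timing: run each step through `timed` and append the
# duration to a log (step names are only indicative).
timed() {
    local name=$1; shift
    local start=$(date +%s)
    "$@"
    echo "$name took $(( $(date +%s) - start ))s" >> /tmp/mkcloud-timings.log
}

timed prepare    ./mkcloud prepare
timed setupadmin ./mkcloud setupadmin
timed setupnodes ./mkcloud setupnodes
```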

@bmwiedemann
Member

When working remotely, you should really mirror the relevant ISOs and repos with rsync from clouddata.nue.suse.com::cloud/images/ and repos/
and from dist.suse.de::repos/SUSE:/SLE-12-SP2:/Update:/Products:/Cloud7/images/iso/
and set the clouddata, distsuse and susedownload variables.

... and maybe we even want to replace our wget with the above rsync?
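Spelled out, that might look roughly like this (destination paths and variable values are placeholders; the repos path under clouddata is inferred from the comment above):

```bash
# Mirror ISOs and repos locally once, then point mkcloud at the mirror
# (destination paths and variable values are placeholders).
MIRROR=/data/cloud-mirror
mkdir -p "$MIRROR"/{images,repos,iso}

rsync -av clouddata.nue.suse.com::cloud/images/ "$MIRROR/images/"
rsync -av clouddata.nue.suse.com::cloud/repos/  "$MIRROR/repos/"
rsync -av dist.suse.de::repos/SUSE:/SLE-12-SP2:/Update:/Products:/Cloud7/images/iso/ "$MIRROR/iso/"

# Serve the mirror (e.g. over HTTP) and tell mkcloud to use it:
export clouddata=mirror.example.local
export distsuse=mirror.example.local
export susedownload=mirror.example.local
```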

@Itxaka
Member Author

Itxaka commented Sep 9, 2016

@bmwiedemann https://github.com/SUSE/cloud/issues/141 this one is for docs of sync-repos 😉; care to open a new one to document the rsync + clouddata, distsuse and susedownload variables for remote users?

EDIT: got sent before I finished :O

@dguitarbite
Contributor

I am having trouble seeing how containerization of mkcloud would make things faster and more scalable. The argument is not against using containers as such, but rather about the design of Crowbar itself, which makes it difficult (at least for me) to picture the architecture as micro-services.

In my head, implementing this point while fixing Crowbar itself would rather be an enhancement for Crowbar. Otherwise, keeping things in virtual machines should be more beneficial.

@bmwiedemann
Member

@Itxaka it is currently in a rather half-finished state, because it is not fully clear how people would want to use it.
E.g. for my "detached Intel NUC cloud" use-case I would like to run something like an mkcloud mirror command that pulls the required repos+ISOs to the gate/host (of the crowbar admin node) and maybe writes example apache-config and nfs-exports files for me to move to /etc/ ... or does it right away (but some people might not like the script messing with their system).
That mirror step would take $cloudsource into account to find out which repos need to be mirrored, but for that to work, we need to split that information out into a separate function.
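Purely as a sketch of that idea (none of this exists in mkcloud today; repo names and paths are made up):

```bash
# Hypothetical "mkcloud mirror" step: derive the repo list from $cloudsource,
# pull everything onto the gate/host, and write example configs instead of
# touching /etc/ directly (repo names and paths are made up).
repos_for_cloudsource() {
    case "$cloudsource" in
        develcloud7) echo "Cloud7-devel SLES12-SP2-Pool SLES12-SP2-Updates" ;;
        develcloud6) echo "Cloud6-devel SLES12-SP1-Pool SLES12-SP1-Updates" ;;
        *) echo "unknown cloudsource: $cloudsource" >&2; return 1 ;;
    esac
}

onhost_mirror() {
    local dest=${mirrordir:-/srv/cloud-mirror}
    for repo in $(repos_for_cloudsource); do
        rsync -av "clouddata.nue.suse.com::cloud/repos/$repo/" "$dest/repos/$repo/"
    done
    # example NFS export for the user to review and move to /etc/exports
    echo "$dest *(ro,no_root_squash)" > "$dest/example-nfs-exports"
}
```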
